Technical Transformation Deep Dive
March 12, 2025
Intro
In our article Workiva Data Prep Overview we outlined that Data Prep is a part of the Data Management Suite with a primary function of transforming data to make it report ready. There are two types of transformation - technical and functional. This post focuses on technical transformation and the capabilities of Data Prep to address common technical transformation requirements. We will explore the functional transformation capabilities of Data Prep in a future article.
Please note, this article is rather lengthy given the nature of exploring the capabilities of multiple transformations. While we welcome you reading the article in its entirety, each transformation overview is written to be standalone so feel free to skip sections that are not relevant to you at this time.
Defining Technical Transformation
Before continuing, it is best to define technical transformation. Broadly, transformation is the process of changing a data file. When we refer to technical transformation, we are describing actions that generally change the structure of the data more than the content. While technical transformation may change content, we typically think of content changes as functional transformation.
Exploring Technical Transformation Capabilities
Workiva’s Data Prep application has a number of technical transformation capabilities, each of which is documented on the Workiva Help Center. In this article, we explore the use cases for key transformations.
Copy Column
Data lineage is the concept that we can trace data from its original value to its final state after all transformations have been applied. Data Prep transformations applied to a column/field are performed inline, meaning the original value is modified and, by default, there is no way to trace the final transformed result back to the original value. The Copy Column transformation can be used to create this type of lineage.
To create a data lineage, use the Copy Column transformation. It populates a new column with all of the values of the column from which it was copied. Applying this transformation before any functional transformation keeps both the before and after states available, thereby creating a lineage.
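The pattern might be sketched as follows. This is an illustration of the concept, not Data Prep itself, and the field names are ours:

```python
# Illustrative sketch: preserve lineage by copying a column before a
# later transformation modifies it in place.
rows = [{"account": "600100", "amount": -45652.78}]

for row in rows:
    # Copy Column step: snapshot the original value first
    row["amount_original"] = row["amount"]
    # A later (functional) transformation can now modify "amount"
    # while "amount_original" preserves the pre-transformation state.
    row["amount"] = abs(row["amount"])
```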
Flip Sign
The Flip Sign transformation is useful when integrating financial data from a general ledger system. General ledger systems apply accounting logic in which liability, equity, and revenue account balances are stored as negative numbers. For example, if you had $45,652.78 of sales for a given period, the G/L extract would provide a value of -45652.78. While this is correct from an accounting perspective, financial reporting would typically show this value as a positive number to indicate positive sales results.
The Flip Sign transformation is a simple and elegant way to transform data from accounting standards to financial reporting standards. Using the transformation, you can evaluate fields in the data to determine if a sign flip is applicable to a given row. For example, the natural account follows a numbering convention where all liability accounts begin with the number two, equity with three, and revenue with four. The transformation would be configured to flip the sign for the amount field based on the account field beginning with 2, 3, or 4.
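The logic described above can be sketched in a few lines. This is a conceptual illustration of the configuration, not Data Prep code, and the field names and sample values are ours:

```python
# Sketch of Flip Sign logic: flip the amount's sign when the natural
# account begins with 2 (liability), 3 (equity), or 4 (revenue).
rows = [
    {"account": "400100", "amount": -45652.78},  # revenue, stored negative
    {"account": "100200", "amount": 1000.00},    # asset, left untouched
]

for row in rows:
    if row["account"].startswith(("2", "3", "4")):
        row["amount"] = -row["amount"]
```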
Formula
The Formula transformation can be very useful. It enables a spreadsheet formula to perform a transformation that is not otherwise available as a predefined transformation. The functions supported by the Formula transformation are documented on the Workiva Help Center, and functions can be nested within a formula just as they can in a spreadsheet.
We would like to offer a word of caution about the Formula transformation. As we noted, Data Prep was built with the business user in mind. Spreadsheet formulas are familiar and in the absence of a data management application like Data Prep, formulas likely have been employed to manage transformation needs. We encourage Data Prep users to resist the urge to simply apply a formula to manage a transformation and instead employ any of the core transformation functionality available before leveraging a Formula transformation. Let’s evaluate a few scenarios where the perceived need to use a Formula can actually be addressed by core functionality.
Scenario 1: Date Logic
In this example, a field in the data file contains a date in the format YYYY-MM-DD, and it needs to be transformed to just the Year. While a LEFT or YEAR function could be used in a formula, there are multiple ways to address this requirement using core functionality.
The column definition is an easy way to cast a date field and is often overlooked. In this example, setting the column Data Type to DATE and then setting the Format to yyyy would cast the date in the field into a Year format.
While the column definition is one way to achieve the desired result, it has a limitation in that the original value in the data file is cast into the format (year) and the additional detail is lost. Let’s assume that while we wanted one column to be the year, we also wanted another column to be the month. In this instance, using the Split Field Value transformation would be a better option as it allows us to parse out the Year and Month values and create two distinct columns.
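The Split Field Value approach described above amounts to parsing the date string on its delimiter. A minimal sketch of the idea, with illustrative field names:

```python
# Sketch of the Split Field Value idea for dates: parse "YYYY-MM-DD"
# into separate year and month columns.
rows = [{"posting_date": "2025-03-12"}]

for row in rows:
    year, month, _day = row["posting_date"].split("-")
    row["year"] = year
    row["month"] = month
```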
Scenario 2: Combining Fields
In this example we have a data file where each segment of the chart of accounts is a unique field (column) in the file. We need to combine the account and subaccount together for our reporting purposes. We could use the CONCATENATE function to achieve this; however, the Merge Columns transformation is specifically intended for this type of operation.
If we wanted to add a prefix to the merged result then the Formula transformation using the CONCATENATE function is a better option as it would reduce the number of transformations that would need to be defined.
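Conceptually, the single CONCATENATE operation combines the join and the prefix in one step. A sketch of the result, with illustrative field names and an illustrative "ACCT-" prefix:

```python
# Sketch of combining account and subaccount with a prefix, the kind of
# operation a single CONCATENATE formula can handle.
rows = [{"account": "600100", "subaccount": "0010"}]

for row in rows:
    # Merge Columns alone could join the two fields; adding the prefix
    # is what makes a single Formula transformation the better fit.
    row["merged"] = "ACCT-" + row["account"] + "-" + row["subaccount"]
```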
Scenario 3: Mapping Data
In this example we received a data extract that contains Actual, Budget, and Forecast data. One of the fields has an indicator of the data type, where A is Actual, B is Budget, and F is Forecast. While a set of nested IF or REPLACE functions could perform this transformation, a mapping group is always the better option when transforming data based on a functional definition such as this. We will cover functional transformation in greater detail in a future post.
Scenario 4: Scaled Data
Returning to our example from the previous scenario, Actual data are in whole dollars but Budget and Forecast data are scaled in thousands. We need to convert Budget and Forecast to whole dollars to make variance reporting more efficient. This is a perfect application of the Formula transformation. Applying an IF function allows us to easily transform the data.
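The IF logic here is straightforward: rescale only the rows that are not Actual. A sketch with illustrative field names and sample values:

```python
# Sketch of the IF-style Formula: rescale Budget and Forecast amounts
# (stored in thousands) to whole dollars, leaving Actual untouched.
rows = [
    {"scenario": "Actual", "amount": 45652.78},
    {"scenario": "Budget", "amount": 45.0},  # i.e., $45,000 in thousands
]

for row in rows:
    # Equivalent to: IF(scenario = "Actual", amount, amount * 1000)
    if row["scenario"] != "Actual":
        row["amount"] = row["amount"] * 1000
```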

Group By
The Group By transformation is useful when data needs to be aggregated. The Group By transformation functions just like a SQL statement Group By. The primary decision point to determine if a Group By (aggregation) operation should be performed by a Data Prep transformation or by a Wdata query is whether there will be a need for the granular detail that comprises the aggregated value.
If there is a need to be able to drill down and see the detail behind an aggregated value then it is recommended to use a Wdata query with a Group By statement. If the detail is not needed then the Data Prep Group By transformation is a simple way to collapse detail and provide aggregate data.
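The aggregation itself is the familiar SQL GROUP BY pattern: collapse detail rows to one total per key. A sketch of what the transformation produces, with illustrative field names:

```python
# Sketch of a Group By aggregation: collapse detail rows to one total
# per account, analogous to SQL's GROUP BY ... SUM(...).
from collections import defaultdict

rows = [
    {"account": "600100", "amount": 100.0},
    {"account": "600100", "amount": 150.0},
    {"account": "600200", "amount": 75.0},
]

totals = defaultdict(float)
for row in rows:
    totals[row["account"]] += row["amount"]

grouped = [{"account": a, "amount": amt} for a, amt in sorted(totals.items())]
```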
Split Field Value
The Split Field Value transformation streamlines parsing of information from a field that contains a concatenated value. Let’s consider a delimited chart of accounts that contains four fields (legal entity, natural account, cost center, and profit center) separated by a dash. This may look something like:
1200-600100-24072-12000
Using the Split Field Value, we can break this string apart and create columns for each segment. Additionally, we can selectively retain certain fields from the split by limiting the number of fields we define in the New Fields Name section of the transformation. For example, if we set the Select Type to First and then only specified three New Field Names, the Profit Center value would not be created as a new column.

Unfortunately, you cannot specify exactly which field values should become new columns. For example, you could not configure the transformation to return only the Natural Account and Profit Center values. To achieve that, you would need to use the Split Field Value transformation and then add a Remove Columns transformation afterward to remove the Legal Entity and Cost Center columns.
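The two-step pattern of splitting and then dropping the unwanted segments can be sketched as follows. This is a conceptual illustration, not Data Prep code, and the field names are ours:

```python
# Sketch of Split Field Value followed by Remove Columns: break the
# delimited chart-of-accounts string apart, then keep only the Natural
# Account and Profit Center columns.
rows = [{"coa": "1200-600100-24072-12000"}]

for row in rows:
    legal_entity, natural_account, cost_center, profit_center = row["coa"].split("-")
    row["legal_entity"] = legal_entity
    row["natural_account"] = natural_account
    row["cost_center"] = cost_center
    row["profit_center"] = profit_center

# Remove Columns step: drop the segments we do not need
for row in rows:
    del row["legal_entity"], row["cost_center"]
```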
String Distance
The String Distance transformation was created specifically to serve GRC data needs. The transformation performs a Levenshtein distance calculation which calculates the minimum number of single-character edits needed to change one string into another. For example, if we were to apply the String Distance transformation to the following strings:
IBM
I.B.M.
The transformation would return a value of 3 since three periods would need to be removed/added to make the two values match each other.
This transformation is very useful to identify strings that may be similar. This is common across master data such as vendors or customers as much of this data is entered by hand and subject to human error. Using the transformation allows us to identify field values that may be different but close to an official item thereby allowing us to review and correct potential discrepancies.
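To make the calculation concrete, here is a minimal Levenshtein distance implementation. This is shown purely to illustrate the metric; Data Prep performs the calculation internally:

```python
# Classic dynamic-programming edit distance: the minimum number of
# single-character insertions, deletions, or substitutions needed to
# turn string a into string b.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[len(b)]
```

Applied to the example above, `levenshtein("IBM", "I.B.M.")` returns 3, since three periods must be inserted or removed to make the strings match.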
Keep Columns / Remove Columns / Rename
While these transformations can be used successfully, we like to remind users that these actions can usually be performed in the Column Definition and Run Pipeline Command configuration. We outlined a scenario in the Split Field Values section above where Remove Columns is needed but we encourage the use of the column definition to perform these types of transformations whenever possible.
Smart Filter
The Smart Filter transformation is used to conditionally keep or remove rows in the data based on one or more criteria specified. While the other Filter transformations work as intended, we find the Smart Filter to be the most flexible and recommend its use almost exclusively for this type of filtering operation.
Smart Filter can be used in conjunction with mapping groups as a way to remove rows based on functional criteria that may be complex or evaluated based on multiple fields. This method is outlined in this knowledge article on the Workiva Help Center.
Sum If
The Sum If transformation is a bit of a misnomer. Sum If does not actually change the data, technically or functionally. Instead, it creates an additional Output when applied to a Pipeline. The transformation aggregates all of the values in a field based on the criteria defined in the Sum If configuration, and the resulting value is available as an Output of the Pipeline that can be used in subsequent Chain actions.
One common functional use of Sum If is to generate checksums which can be used to ensure data completeness has not been compromised after the transformations have been applied. While the Sum If value is not able to control any Pipeline actions, Conditionals and branching logic in Chains can be used to stop additional processing and prevent incomplete or inaccurate data from being loaded to a downstream system such as a Wdata table.
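The checksum idea can be sketched as a conditional sum compared against a known control total. This is an illustration of the concept, with field names and the expected total being ours:

```python
# Sketch of a Sum If checksum: total the amount field for rows matching
# a criterion, then compare against a known control total before loading.
rows = [
    {"scenario": "Actual", "amount": 100.0},
    {"scenario": "Budget", "amount": 40.0},
    {"scenario": "Actual", "amount": 60.0},
]

# Sum If: aggregate amounts where scenario equals "Actual"
checksum = sum(r["amount"] for r in rows if r["scenario"] == "Actual")

# A Chain conditional could halt processing when the checksum disagrees
# with the expected control total (illustrative value).
expected_total = 160.0
data_is_complete = checksum == expected_total
```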
Data Prep Limitations
We have outlined a number of the powerful technical transformations that Data Prep can perform. We feel that it is important to note a key limitation: Data Prep transformations evaluate each row of data individually. This means a transformation cannot look at the values in other rows in order to transform the row currently being processed.
One example would be evaluating whether any row in the data contains a specific value and then applying a transformation; this is not possible. Another would be aggregating the values of multiple rows and then using that total as part of a transformation. This is a common requirement for allocations, where the value of an individual record needs to be divided by the total of all records meeting a certain criterion to calculate a percentage of total. This cannot be done in a single Pipeline. While there are ways to achieve this creatively through multiple Pipelines, we caution against it, as Data Prep is a transformation application, not a calculation engine.
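To see why the percent-of-total allocation requires cross-row context, note that it takes two passes over the data, which a single row-by-row Pipeline cannot perform. An illustrative sketch:

```python
# Illustration of why percent-of-total needs cross-row context.
rows = [
    {"cost_center": "A", "amount": 250.0},
    {"cost_center": "B", "amount": 750.0},
]

# Pass 1: aggregate across all rows (not possible row-by-row)
total = sum(r["amount"] for r in rows)

# Pass 2: derive each row's percentage of the total
for row in rows:
    row["pct_of_total"] = row["amount"] / total
```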
Summary
In this article we defined technical transformation and reviewed several of the most common technical transformations used in Data Prep Pipelines. We also outlined key limitations to help you make better decisions about how to use Data Prep in your data management processes. We hope that you see the potential for Data Prep to simplify your data integration processes and empower citizen developers to define easy-to-understand, powerful transformation procedures.
A future article will explore the functional transformation capabilities of Data Prep which, when combined with the technical transformation capabilities, offers a rich data management solution.