Collibra DGC does not accept comma-separated values within one cell, what should I do?

Multiple values are many times separated by commas from a source system export. This most likely results from  many values sharing a relationship to one value. Collibra DGC cannot distinguish that the cell contains multiple terms. The best solution is to avoid these comma separated values if possible. Otherwise the solution is to separate the comma separated values into cells and then transpose the data. For example:

Comma Separated Version

 

 

Transposed Version

 

 

Excel has a Text to Columns feature to split these values into multiple cells.

  1. First, select the cells containing comma separated values and click the Text to Columns button
  2. Choose the Delimited option and select Next
  3. Preview the output and select Next
  4. Approve and select Finish. Note: If values are separated by a comma and space, there will be a leading space in the resulting cell. Leading and trailing spaces can be removed with the Trim function.
  5. Each language now has its own column. Manually name them Language 1, Language 2, Language 3…
  6. The best way to transpose this format in Excel is to manually copy and paste each row and column. This technique may be insufficient for larger tables. A work around could be to break the data down into several smaller tables. Ideally, a more powerful data transformation tool would be used in these situations.
  7. To complete the process, just clean up the data by removing blank rows and headers

You have to login to comment.