RSS
 

#17: Reformat data in Excel (sheet A to Array x to Array y to sheet B)

25 Jul



This post is an example for you to practice post#9 and post#16. The 2 images on the right hand side are illustrations to show what I am tyring to do in this post:

In the excel file “MeasurementTab”, I created 2 sheets, one is called “Raw” and the other is called “ModeCategoryUnit”. I copied all the protocol tables from our Software Specifications, and paste them one by one to the “Raw” sheet. There are less than 50 Protocols, so it is okay to copy them manually. However, each Protocol table contains 2 to 20 measurements, so there are about 500 measurements in-total, and I want to rearrange all these measurements in another format/order and save them in the “ModeCategoryUnit” sheet, which will take days if I do it manually.

I decide to automate the data process. I will export the data from “Raw” sheet to an array called Raw(). By using the array Raw() and another small array ModeArray(1 To 9), I will rearrange the data in a new order, and save the data to an array called ModeCategoryUnit(). Finally, data in array ModeCategoryUnit() will be exported to sheet “ModeCategoryUnit” in the Excel file.

'Convert "Raw" sheet (populated directed from software specification - word version) to "ModeCategoryUnit" sheet in "Measurement.xlsx"
'If handle all the Packages/Protocols/Measurments, then need to increase ModeCategoryUnit's size and size parameters in Function_SaveArrayAsExcelSheet

'$TPinclude "Declaration_GlobalConstants"

Sub Main()

'Store Data from the "Raw" Sheet to array Raw()
Dim RowCount As Integer
Dim ColumnCount As Integer
Dim Raw() As String

RowCount = Function_ExcelSheetRowCount.ExcelSheetRowCount(ExcelFilePath_Measurement, "Raw")
ColumnCount = Function_ExcelSheetColumnCount.ExcelSheetColumnCount(ExcelFilePath_Measurement, "Raw")

ReDim Raw(1 To RowCount, 1 To ColumnCount) As String
Raw() = Function_GetExcelSheetData.GetExcelSheetData(ExcelFilePath_Measurement, "Raw")

'Delete extra space (if any) for each cell of array Raw()
Dim TrimRaw() As String
ReDim TrimRaw(1 To RowCount, 1 To ColumnCount) As String
Dim i As Integer    'row number of array Raw()
Dim j As Integer    'column number of array Raw()

For j = 1 To ColumnCount
    For i = 1 To RowCount
    TrimRaw(i, j) = Trim(Raw(i, j))
    Next
Next

'Convert data from array TrimRaw() to array ModeCategoryUnit()
Dim ModeCategoryUnit(1 To 15, 1 To 60) As String    'Size should be increased if handle all the Packages/Protocols/Measurments

'First, write the header section (top 3 rows) of array ModeCategoryUnit()
'Including Package (1st row), Protocol (2nd row) and Mode (3rd row)
Dim Package As String
Dim Protocol As String
Dim ModeArray(1 To 9) As String
Dim ModeList As String
Dim Counter As Integer
Dim Column As Integer

Column = 1
For j = 1 To ColumnCount
    If TrimRaw(1, j) <> "" Then
        Package = TrimRaw(1, j)
        If TrimRaw(2, j) <> "" Then
            Protocol = TrimRaw(2, j)
            'Reset Counter, Clearn ModeList and ModeArray for the FOR loop
            Counter = 1
            ModeList = ""
            Erase ModeArray
            For i = 3 To RowCount
                If TrimRaw(i, j + 3) <> "" Then
                    'If TrimRaw(i, j + 3) doesn't exist in ModeList
                    If InStr(ModeList, TrimRaw(i, j + 3)) = 0 Then
                        ModeArray(Counter) = TrimRaw(i, j + 3)
                        'ModeList looks like "ModeArray(Counter),ModeArray(Counter+1),,,,,,,,"
                        ModeList = Join(ModeArray, ", ")
                        Counter = Counter + 1
                    End If
                Else: Exit For
                End If
            Next
            'Reset Counter for the DO loop
            Counter = 1
            Do While Counter <= UBound(ModeArray) And ModeArray(Counter) <> ""
                ModeCategoryUnit(1, Column) = Package
                ModeCategoryUnit(2, Column) = Protocol
                ModeCategoryUnit(3, Column) = ModeArray(Counter)
                Counter = Counter + 1
                Column = Column + 3
            Loop
        End If
    End If
Next

'Second, write the rest section of array ModeCategoryUnit()
'including Measurement, and its Category and Unit (same row)
Dim Measurement As String
Dim Category As String
Dim Unit As String
Dim Mode As String
Dim k As Integer    'row number of array ModeCategoryUnit()
Dim l As Integer    'column number of array ModeCategoryUnit()

For j = 1 To ColumnCount
    If TrimRaw(1, j) <> "" Then
        Package = TrimRaw(1, j)
        If TrimRaw(2, j) <> "" Then
            Protocol = TrimRaw(2, j)
            For i = 3 To RowCount
                If TrimRaw(i, j) <> "" Then
                    Measurement = TrimRaw(i, j)
                    Unit = TrimRaw(i, j + 1)
                    Catergory = TrimRaw(i, j + 2)
                    Mode = TrimRaw(i, j + 3)
                    'Search all the columns of array ModeCategoryUnit()
                    For l = LBound(ModeCategoryUnit, 2) To UBound(ModeCategoryUnit, 2)
                        If ModeCategoryUnit(1, l) = Package And ModeCategoryUnit(2, l) = Protocol And ModeCategoryUnit(3, l) = Mode Then
                            For k = 4 To UBound(ModeCategoryUnit)
                                If ModeCategoryUnit(k, l) = "" Then
                                    ModeCategoryUnit(k, l) = Measurement
                                    ModeCategoryUnit(k, l + 1) = Catergory
                                    ModeCategoryUnit(k, l + 2) = Unit
                                    Exit For
                                End If
                            Next
                            Exit For
                        End If
                    Next
                Else
                    If TrimRaw(i, j + 3) <> "" Then
                        Mode = TrimRaw(i, j + 3)
                        For l = LBound(ModeCategoryUnit, 2) To UBound(ModeCategoryUnit, 2)
                            If ModeCategoryUnit(1, l) = Package And ModeCategoryUnit(2, l) = Protocol And ModeCategoryUnit(3, l) = Mode Then
                                For k = 4 To UBound(ModeCategoryUnit)
                                    If ModeCategoryUnit(k, l) = "" Then
                                        ModeCategoryUnit(k, l) = Measurement
                                        ModeCategoryUnit(k, l + 1) = Catergory
                                        ModeCategoryUnit(k, l + 2) = Unit
                                        Exit For
                                    End If
                                Next
                                Exit For
                            End If
                        Next
                    Else: Exit For
                    End If
                End If
            Next
        End If
    End If
Next

'Use shared module "Function_SaveArrayAsExcelSheet" to write data from array to Excel
a = Function_SaveArrayAsExcelSheet.SaveArrayAsExcelSheet(ModeCategoryUnit, 15, 60, ExcelFilePath_Measurement, "ModeCategoryUnit")

'Erase arrays
Erase ModeCategoryUnit
Erase TrimRaw
Erase Raw

End Sub

From the above script, we practiced the following:
1, Use a module “”Declaration_GlobalConstants”" to define a globe constants “ExcelFilePath_MeasurementTab”. (for more details see post#10)
2, Call Function_GetExcelSheetData to transfer data from the “Raw” sheet to array Raw(). (for more details see post#9)
3, Use Trim function to remove spaces on both sides of a string.
4, Join array elements to a string (ModeList = Join(ModeArray, “, “))
5, Call Function_SaveArrayAsExcelSheet to transfer data from array ModeCategoryUnit() to “ModeCategoryUnit” sheet. (for more details see post#16)
6, Erase array (e.g., Erase Raw)

There are an potential improvement for the above script. Currently, I hard coded the size of array “ModeCategoryUnit”, so when there are more measurements added, then I need to change the size in the script. However, there is a solution: “Adding New Elements to array on the Fly” . I didn’t do this because that the logic of the script is already complected, and I do not want to confuse you, but you can try this out by yourself!

 
1 Comment

Posted by Jia Qi in TestPartner (TP)

 

Tags: , , , , , , , , , ,

Leave a Reply

 
 
  1. Bin

    July 29, 2010 at 10:42 pm

    The header written part should be combined with the mode categorized part.

    In the original raw data sheet, the package/protocol combinations are unique for every 4 columns. In each of the package/protocol combination, there are various modes. And importantly, the combinations are not repetitive, therefore we can sequentially.

    After breaking down the raw data into 4 columns, we can now focus on the 4 column unit. The package and protocol are constants, the only varying variable now is the “mode”. We can loop through all the mode/measurement/category/unit rows and categorize them into modes instead of the original measurement/category/unit.

    After get the new data sorted by the mode, we can just easily add the current package/protocol, and store them in an array, which will be assembled for all of the every 4 columns and finally put into the new file.