Looping Through Data


I don't know about you, but I often organize students with google sheets. I use rows for each student and columns for information related to the student. Sometimes I need to push this informatiuon back to students or parents. Sometimes I have do something else with the information. It’s helpful to know how to get the data and work with it in a loop.

Spreadsheets are a two dimensional arrays. There is information in Rows, dimension 1,  and information in columns, dimension 2. We can target each cell by knowing the proper row and column.

We can also loop through each row and do something with the data. 

Computers start counting from 0. Google Sheets starts counting the informational rows and columns from 1. I will also use my own label in row 1 to identify the column number. This way I can insert columns and move them around and still retrieve the same information. Let’s start with identifying the columns by our own labels and setting the column number to a variable which makes sense to us. 

function loopThroughData(){

  var ss= SpreadsheetApp.getActiveSpreadsheet();
  var sheet= SpreadsheetApp.getActiveSheet();
  var emailcol = getColumnByName(sheet, "Email");
  Logger.log(emailcol);
}

function getColumnByName(sheet, name) {
  var range = sheet.getRange(1, 1, 1, sheet.getMaxColumns());
  var values = range.getValues();
  for (var row in values) {
    for (var col in values[row]) {
      if (values[row][col] == name) {
        return parseInt(col)+1;
      }
    }
  }
}

A function called getColumnByName is used in our loopThroughData function. It’s used with two variables pushed into it. The first is the sheet we are working on and the second is the column name. This function runs and will return the column number if the column name is found. The program returns to the loopThroughData function and completes the script Logging the column number. 
This begins to give us some flexibility with the sheet. We can add and move columns and not work about breaking our program. We can run it and we’ll see the column number returned in the log. The intent here is to be able to loop through and use data in a full sheet. We know how to get the column numbers but what about the rows and the rest of the data. 

Efficiency in code so far doesn’t seem like a big deal. We’ve looked at very little data. But if we were to use an entire sheet composed of the maximum 5,000,000 cells (as of the newest 2018 update) we need to start being aware of how long each step of code takes. Even if we consider using up to 1,000,000 cells it can take longer than we expect to finish our script. Attendance data can exist in large data sets. Consider a school of 400 students * 180 school days * 7 classes each of attendance. 504,000 cells of data. If a step of our program takes .01 seconds it will take 5,040 seconds to complete or 84 minutes or 1 hour and 24 seconds. And that’s just one step. So while we can pull each cell individually with sheet.getRange(row, column).getValue() we need to ask ourselves if it’s the most efficient. 

Our loop will instead pull all the data, once, and reference it in the computer memory. Below I concentrate on our main function our getColumnByName should remain in the script. I usually keep functions I use in larger functions at the bottom on all my scripts. 

function loopThroughData(){
  var ss= SpreadsheetApp.getActiveSpreadsheet();
  var sheet= SpreadsheetApp.getActiveSheet();
  var emailcol = getColumnByName(sheet, "Email");
  var data = sheet.getDataRange().getValues();
  for (var i =1;i<data.length;i++){
    Logger.log(data[i][emailcol-1]);
  }
}

The variable data is the used area of the sheet. It provides data in a matrix format. The first referenced data index is the row. The second is the column. We would use a reference to the index number of the row and column to reference a value. The index starts from 0 so row 0 is our header. In the sheet it’s row 1.Our column   numbers compared to the sheet are all offset by -1. To return the header cell containing “Email” we can use row 0 and column emailcol-1. We’d write it like this in our script.

data[0][emailcol-1]

The square brackets [] are used to indicate an index number. They can be strung together to reference indexes of indexes.

Now that we have a concept of indexes we are going to use a for loop to walk through our rows of data. For loops have three inputs to get them started. First a variable to count with set to the first number; var i = 0. Then the end of the loop as an expression. I usually think “while” in my head as I write it; i<data.length. And finally an increment for our counting variable to increase by with each loop; i++. JavaScript allows the short hand of using ++ to increment by 1 rather than i=i+1.

Here’s our initial for loop. 
for (var i =1;i<data.length;i++)

The loop needs to know what to do each time around. We put an action inside curly brackets { and }. We can also use the i variable in our action. 

data[i][emailcol-1]

With data we are saying return the second row (computers start from 0 and i starts being set as 1 se we are skipping our header row) and the column of the email (we assigned emailcol as the sheet column which is now offset in data by -1 so we offset the email column number by -1).

Putting it all together we can loop through the data and do something with it. Above we are returning whatever is in the email column and putting it in the log. We can instead send emails, create slides, make websites or many other operations which we would do manually but take a lot of time. Here's a sample sheet that sends out emails. 
Sample

Enjoy!

No comments:

Post a Comment

Social Emotional Well-Being During Online Teaching and Learning

The world is a bit nutty right now because of the COVID-19 pandemic. Schools are switching to online learning and parents are working from...