Preparing your data for use in Distance, August 2015

Introduction

One of the goals of the workshop is to enable people who have already collected distance sampling data to do some preliminary analysis of their data using the computer program Distance. This page explains how to get your data into a format that Distance can easily read. If you can bring your data to the workshop in this format then we will be able to import it more quickly, and so you will have more time to play with the analysis.

If you are already using Distance 6 (the current version) then feel free to bring along your data and analyses as a Distance project file.

If you do not have data to bring to the workshop, we will be providing plenty of informative exercises to keep you busy during the computer sessions. There may well be someone with similar interests to you who has brought data along, and you could discuss the analysis together.

Getting Data into Distance

Distance imports data from a delimited text file. For example, the data from a stratified line transect survey might look like this:

Stratum 1;100;Line 1;10;14
Stratum 1;100;Line 1;10;8
Stratum 1;100;Line 1;10;22
Stratum 1;100;Line 2;10.3;7
Stratum 1;100;Line 2;10.3;37
Stratum 1;100;Line 2;10.3;13
Stratum 2;123;Line 1;5.7;
Stratum 2;123;Line 2;8.4;27
Stratum 2;123;Line 2;8.4;76
Stratum 2;123;Line 2;8.4;44
Stratum 2;123;Line 2;8.4;7

In this file, the columns are separated by semicolons. Column 1 is the stratum name, column 2 is the stratum area, column 3 is the transect name, column 4 is the transect length and column 5 is the perpendicular distance. Notice that all transects from the same stratum are grouped together on adjacent lines, and all observations from the same transect are grouped together (this is important: sort your data by transects within strata). Notice also that the record "Line 1" in "Stratum 2" has no distance in the final column: this is a transect where no objects were seen.
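If you would like to check the grouping before the workshop, the rule above can be sketched in a few lines of Python. This is only an illustration, not part of Distance: the function name `find_ungrouped` is made up for this example, and it assumes the five-column semicolon layout shown above, with the stratum name in column 1 and the transect name in column 3.

```python
import csv
import io

# The example file from above, with columns separated by semicolons.
SAMPLE = """\
Stratum 1;100;Line 1;10;14
Stratum 1;100;Line 1;10;8
Stratum 1;100;Line 1;10;22
Stratum 1;100;Line 2;10.3;7
Stratum 1;100;Line 2;10.3;37
Stratum 1;100;Line 2;10.3;13
Stratum 2;123;Line 1;5.7;
Stratum 2;123;Line 2;8.4;27
Stratum 2;123;Line 2;8.4;76
Stratum 2;123;Line 2;8.4;44
Stratum 2;123;Line 2;8.4;7
"""


def find_ungrouped(rows):
    """Return strata or transects that reappear after other rows intervene.

    Hypothetical helper: assumes column 1 is the stratum name and
    column 3 is the transect name, as in the example file.
    """
    problems = []
    seen_strata, seen_transects = set(), set()
    prev_stratum, prev_transect = None, None
    for row in rows:
        stratum = row[0]
        transect = (row[0], row[2])  # transect names can repeat across strata
        if stratum != prev_stratum:
            if stratum in seen_strata:
                problems.append(("stratum", stratum))
            seen_strata.add(stratum)
        if transect != prev_transect:
            if transect in seen_transects:
                problems.append(("transect", transect))
            seen_transects.add(transect)
        prev_stratum, prev_transect = stratum, transect
    return problems


rows = list(csv.reader(io.StringIO(SAMPLE), delimiter=";"))
print(find_ungrouped(rows))

# Sorting by stratum name, then transect name, brings scattered rows together.
regrouped = sorted(rows, key=lambda r: (r[0], r[2]))
```

An empty list means every stratum and every transect occupies one block of adjacent lines. The sort is alphabetical, so the blocks may not come out in survey order, but the rows of each transect stay together within their stratum.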

There is a narrated video that walks through the steps needed to import these data into Distance. The video is 8 minutes long and covers what you need to know to import simple data structures.

Which columns should you include in your data file? As a minimum, your file should contain a column for transect or point name and a column for observed distance. For line transect surveys you will also need a column for transect length. If your survey involved stratification then you will need to include columns for stratum name and stratum area. If you measured radial distance and angle then you will need a column for angle, and if your objects are clusters, rather than individuals, then you should include a column for cluster size. Your data file will contain somewhere between 2 and 7 columns, depending on the type of survey.
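The rules in the paragraph above can be written out as a small Python sketch. The function `required_columns` and the column labels it returns are invented for this illustration; Distance does not require these names, and the order shown is only one possibility.

```python
def required_columns(line_transect, stratified=False, radial=False, clusters=False):
    """List the columns a data file needs for a given type of survey.

    Hypothetical helper: the labels are descriptive only.
    """
    columns = []
    if stratified:
        columns += ["stratum name", "stratum area"]
    columns.append("transect name" if line_transect else "point name")
    if line_transect:
        columns.append("transect length")
    columns.append("distance")
    if radial:
        columns.append("angle")
    if clusters:
        columns.append("cluster size")
    return columns


# Simplest case: an unstratified point transect survey of individuals.
print(required_columns(line_transect=False))

# Most columns: a stratified line transect survey of clusters,
# recorded as radial distance and angle.
print(required_columns(line_transect=True, stratified=True, radial=True, clusters=True))
```

The first call gives the 2-column minimum and the second gives all 7 columns.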

The columns should be separated by a delimiter, which is a single ASCII character: either a tab, semicolon, comma or space. The order of the columns is not important, as you tell Distance which column is which during the import process. Each row should end with a carriage return + line feed (CRLF) combination. This is the default end-of-line indicator used by most Windows-based applications.
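If you prepare your file with a script rather than a spreadsheet, the following Python sketch shows one way to produce the delimiter and line endings described above using the standard `csv` module. The helper `write_rows` and the file name `mydata.txt` are assumptions made for this example. The check for delimiters inside values is an extra precaution added here, not a documented Distance requirement.

```python
import csv
import io


def write_rows(handle, rows, delimiter=";"):
    """Write rows with the chosen delimiter and CRLF line endings.

    Hypothetical helper. It refuses values that contain the delimiter,
    because the csv module would otherwise wrap them in quotes.
    """
    for row in rows:
        for value in row:
            if delimiter in str(value):
                raise ValueError(f"value {value!r} contains the delimiter")
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\r\n")
    writer.writerows(rows)


rows = [
    ["Stratum 1", 100, "Line 1", 10, 14],
    ["Stratum 2", 123, "Line 1", 5.7, ""],  # transect with no observations
]

# Demonstration in memory; repr() makes the line endings visible.
buffer = io.StringIO()
write_rows(buffer, rows)
print(repr(buffer.getvalue()))

# To write a real file, open it with newline="" so that Python does not
# alter the line endings:
#     with open("mydata.txt", "w", newline="") as handle:
#         write_rows(handle, rows)
```

With a space as the delimiter, names such as "Stratum 1" would be rejected by this helper, which is one reason to prefer a tab, semicolon or comma when names contain spaces.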

Data collected in intervals (bins)

In some distance sampling surveys, the exact distances to the observations are not recorded. Instead, observations are placed in pre-defined intervals, or bins. For example, in a point transect survey of songbirds one could define intervals of 0-50 metres, 50-100, and 100-200. To enter this type of data into Distance, enter each observation at the mid-point of the interval. If there were 2 birds seen in 0-50, 3 at 50-100 and 3 at 100-200 on point 1, then the data file would look like this:

Point1,25
Point1,25
Point1,75
Point1,75
Point1,75
Point1,150
Point1,150
Point1,150

(In this example we are pretending that there are no strata, so there is no stratum name or stratum area column. Because it is a point transect example, there is no transect length column. We are using a comma as delimiter.)
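If your interval data are stored as counts per interval, the expansion to one row per observation can be sketched in Python as follows. The function `midpoint_rows` and the dictionary layout are assumptions made for this illustration; the counts are those from the example above.

```python
def midpoint_rows(point, counts):
    """Expand counts per interval into one (point, midpoint) row per observation.

    Hypothetical helper: `counts` maps (lower, upper) interval bounds
    to the number of observations recorded in that interval.
    """
    rows = []
    for (lower, upper), n in counts.items():
        midpoint = (lower + upper) / 2
        rows.extend([(point, midpoint)] * n)
    return rows


# Counts for point 1: 2 birds in 0-50, 3 in 50-100 and 3 in 100-200.
counts = {(0, 50): 2, (50, 100): 3, (100, 200): 3}

for name, distance in midpoint_rows("Point1", counts):
    print(f"{name},{distance:g}")
```

Running this prints the same eight comma-delimited lines shown in the example file above.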

Additional information

The description above is for the most basic data structure to be imported. Distance is capable of importing additional columns of data that you may wish to use in your analysis (such as year of survey in multi-year surveys). This will be covered during the workshop. To keep maximum flexibility, it is best to bring your data along as a text file, as outlined above, but also in its original spreadsheet or database format in case you decide to take on a more complex analysis in a later part of the workshop.

With software such as a spreadsheet, it is easy to arrange, sort and filter columns and export them into a text file that Distance can read. Alternatively, if you are bringing along your own computer then you can use your favourite software to do the required re-formatting.

Conclusion

If you are still confused about what to do, don’t worry – we will be on hand to help when you get to the workshop. Just bring some of your data along in an electronic format and we should be able to get some of it into Distance for you to analyse.