Tuesday, July 7, 2009

Sample Data

Here is a tab-delimited text sample output of a day or two of data from my GoWearFit branded BodyMedia armband (device was not worn much except at night).

You can import this data into a spreadsheet. Note there are multiple structures dumped.

Note that the column headers do NOT LINE UP with the data columns: I am just dumping one byte per column, and some values are obviously 16-bit or wider. If you solve this little puzzle, please let me know!
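
One way to attack the puzzle is to pair adjacent byte columns into 16-bit integers under both byte orders and see which reading gives plausible, smoothly varying values. Here is a minimal Python sketch, assuming each row has already been parsed into a list of integers in the range 0-255. The helper name `pair_bytes` and the sample row are made up for illustration and are not part of the library.

```python
def pair_bytes(row, offset=0):
    """Combine adjacent byte values into 16-bit integers, both byte orders.

    row    -- list of ints in the range 0..255 (one per dumped column)
    offset -- index of the first byte to pair (try 0 and 1 to test alignment)
    Returns (little_endian_values, big_endian_values).
    """
    little, big = [], []
    # Step two bytes at a time; a trailing unpaired byte is ignored.
    for i in range(offset, len(row) - 1, 2):
        pair = bytes(row[i:i + 2])
        little.append(int.from_bytes(pair, "little"))
        big.append(int.from_bytes(pair, "big"))
    return little, big


# Hypothetical row of byte columns, for illustration only.
sample = [0x34, 0x12, 0xFF, 0x00, 0x07]
le, be = pair_bytes(sample)
print(le)  # [4660, 255]
print(be)  # [13330, 65280]
```

Running this with `offset=0` and `offset=1` over many consecutive rows should show which alignment and byte order produce values that change gradually from row to row, which is what a real sensor reading would be expected to do.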

This is the output of the SaveStructTabDelim() function in my Python library for talking with the device. See the code for detail.

http://www.filedropper.com/z705a-full-serial

Contact me to get a cPickle capture of the raw packets, which contain some transactions not represented in the processed tab-delimited form.

I'm very curious what you all make of it.

4 comments:

  1. Nice work!

    I think you are correct after all that it only logs things in one-minute epochs, based on the data dump over at the Bodybugg Hacks blog:
    http://bodybugghacks.blogspot.com/2007/10/different-view-of-same-raw-data.html

    In order to do this, it needs summary statistics for each epoch. The values with the MOV- prefix are probably some kind of mean (maybe just "Mean Observed Value"?), and I'm fairly sure that the MAD- prefix refers to median absolute deviation (a measure of variability).

    First the obvious stuff which I'm sure you know already:
    ACCFW, ACCTR, and ACCLO refer to acceleration on forward, transverse, and longitudinal axes.
    TSKIN is skin temperature.
    GSR is galvanic skin response.
    EE might be heat flux (see below).
    TCOV, VBAT, and ON are mysteries to me...Where are the values from the heat flux sensor?

    The oldest Bodymedia devices had heart rate inputs, and they have some white papers talking about developing their own heart rate monitor (something that's infrared so it doesn't need a chest strap), so HRATE is probably just a placeholder for this, but there shouldn't be any data there (maybe it's all zeroes?). Perhaps they also have placeholders for ECG data, and that's what the MADECG is for (though it would be weird to have only a single channel of ECG data). PEDO3 may also be an empty slot for data from a pedometer, or it could be derived data from the accelerometers that actually counts steps. Based on the Bodybugg Hacks data dump, it looks like just a placeholder.

    I was looking at some papers on actigraphy (using accelerometers to measure sleep efficiency), and there are a few standard summary statistics used to turn an epoch's worth of raw data into a single value. One is to count the number of zero crossings on each axis, so perhaps that's what F0CROSS is, though I'm not sure why there wouldn't also be zero crossing counts for the transverse and longitudinal axes. Maybe FCOUNT, TCOUNT, and LCOUNT are actually the zero crossing counts, and who knows what F0CROSS is. Another way to summarize epochs is to integrate and get the area under the acceleration curve for each axis, but the zero-crossing method is basically just as accurate, so in a simple device like this, I'd expect them to track zero crossings. But there is also some kind of peak value measurement for each axis (FWPEAKS and TRPEAKS--but where is the longitudinal axis?). THETA is often used to represent an angle of rotation, so maybe this (MOVTHETA) is also some kind of derived value from the accelerometers to indicate a change in orientation?

    A month or two ago I read a bunch of the white papers and journal articles that the Bodymedia people have published, and I feel like at that time I had an idea of what PLATEAU referred to, but I've forgotten.

    I take it you haven't yet found anything that looks like an epoch value? Based on the Bodybugg Hacks raw data, the older device was just using standard Unix epochs, so you might try searching for values from around the time you collected the data, but I'm sure you've tried this already.

    (to be continued...)

  2. Finally, note that the raw data over at the Bodybugg Hacks site (http://bodybugghacks.blogspot.com/2007/10/more-progress.html) comes in almost the same order as what you've gotten here--is it possible that you've missed some fields, or do you think the protocol has changed? As far as I know, the only sensor change from that model to the device you have is the move from a 2-axis to a 3-axis accelerometer, and if you replace the heat flux (RAWHF) with the forward axis of the accelerometer (RAWACCFW) and assume that EE is mean heat flux (MOVHF), the first 18 values are the same. The 19th parameter (SPARE18) used to be blank, so it makes sense that they've added one of the forward-axis accelerometer parameters there (F0CROSS). Here's the original parameter list from the above link:

    Time,RAWHF,RAWTSKIN,RAWGSR,RAWACCTR,RAWACCLO,
    RAWVBAT,RAWTCOV,RAWON,MOVHF,MOVTSKIN,MOVGSR,
    MOVACCTR,MOVACCLO,MOVVBAT,MOVTCOV,MOVON,
    MADACCTR,MADACCLO,SPARE18,HRATE,PEDO3,PLATEAU,
    MADECG,TRPEAKS,MOVTHETA,MADACCV,MOVACCV,MADHF,
    RAWACCV,TCOUNT,LCOUNT,PEDO3TOE,TIMESTMP,
    HEARTBT,T0CROSS,L0CROSS,RAWECG,LOGSWEEP,
    MADTHETA,LOPEAKS,COMPGSR,RAWCGSR

    Hope some of this is helpful!

  3. Kenneth,
    Your analysis is very helpful. I hope you keep it up, and ask me for any data you want.

    I've been fully devoted to decoding the protocol and data structures so far, and have had almost no time for the data analysis (and may not even have time in the near future to do so, unfortunately).

    The "Bodybugg Hacks" blog describes an entirely different clear-text protocol from the fully binary packetized communications I found.

    A lot must have changed between versions.

    As you say, most, but not all, field names are the same. I haven't missed fields - I pull the names neatly numbered 1-30 right out of the data structure. I also just now checked my sniffed intercepts from the official software and "RAWHF" is definitely never mentioned.

    One thing that is the same: Henry displays "Record type" 16 and 17. In my data, there are what appear to be type ID fields of 16 and 17 (and others) within each table row. But the field names don't easily mesh.

    I had done some quick searches for most significant digits of Unix epoch timestamps in the raw data, but didn't find any on first try.

    I strongly suspect that each new data table is started when the armband is put on. Each table has a header which I have not decoded. I suspect that table header structure may contain some sort of start-time timestamp.

    The first two fields of each table row contain what is almost certainly a 16-bit integer with very "incremental" properties. I speculate it could be time related, although it is not monotonically increasing. Maybe a "time since last record"? Could just be a temp or some steady measure.

    But enough speculation from me... :)

    A clean full 24-hrs of data tomorrow.

  4. MOV = minimum output variance ???

    A bunch of Google hits use "MOV" as the abbreviation for minimum output variance in signal processing.

    Here is a 2004 paper in the domain of active noise and vibration control on making algorithms more robust when sensor saturation may occur. I don't totally grok it from the abstract, but it seems relevant.

    http://www3.interscience.wiley.com/journal/109746344/abstract?CRETRY=1&SRETRY=0

    Note, I posted a full 24-hrs of data to the blog yesterday.

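
The timestamp search discussed in comments 1 and 3 can be sketched roughly as follows: slide over the raw bytes and report every offset where a 32-bit integer, in either byte order, falls inside the window when the data was recorded. This is only a minimal Python sketch. The function name `find_epoch_candidates`, the 32-bit width, the search window, and the example buffer are all assumptions for illustration, and nothing here confirms that the device stores Unix time at all.

```python
import struct
from datetime import datetime, timezone


def find_epoch_candidates(raw, start, end):
    """Return (offset, byte_order, value) for every 32-bit unsigned integer
    in `raw` whose value lies between the Unix times `start` and `end`."""
    hits = []
    for offset in range(len(raw) - 3):
        chunk = raw[offset:offset + 4]
        for order, fmt in (("little", "<I"), ("big", ">I")):
            (value,) = struct.unpack(fmt, chunk)
            if start <= value <= end:
                hits.append((offset, order, value))
    return hits


# Search window: the week before this post (an assumption about capture time).
start = int(datetime(2009, 7, 1, tzinfo=timezone.utc).timestamp())
end = int(datetime(2009, 7, 8, tzinfo=timezone.utc).timestamp())

# Made-up buffer with one big-endian timestamp planted at offset 2.
planted = int(datetime(2009, 7, 6, 12, 0, tzinfo=timezone.utc).timestamp())
raw = b"\x00\x01" + struct.pack(">I", planted) + b"\xff\xff"

for offset, order, value in find_epoch_candidates(raw, start, end):
    print(offset, order, datetime.fromtimestamp(value, tz=timezone.utc))
# prints: 2 big 2009-07-06 12:00:00+00:00
```

A search like this only catches absolute 32-bit timestamps. A 16-bit counter or a "time since last record" field, as speculated in comment 3, would not show up and would need a different test, such as looking at the differences between consecutive rows.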