I’ve been using the Pedometer++ app since 2015 to track my daily steps, and overall I’m very pleased with how it works and looks. A little while ago, though, I noticed an issue: the number of floors I’ve supposedly climbed is wildly off in the app.
A daily best of 12,403 would be the equivalent of climbing about five Mount Everests in a day. Exporting the data from the app and graphing it shows the problem in more detail:
Something has obviously gone wrong with the floors data in 2015 and a quick email to Pedometer++ customer support confirms this:
Several years ago changes were made to the step database when new features were introduced in the app.
This created the problem where your data prior to November 27 2015 was corrupted.
Sorry but we cannot fix that problem now.
While they claimed that the data was unfixable, I decided not to take their word for it. The Pedometer++ app allows you to export the data from the app as a CSV or as a custom Export.steps file that can be imported back into the app.
Running the file command on this backup file reveals that it’s LZFSE compressed:
» file Export.steps
Export.steps: lzfse compressed, compressed tables
LZFSE is an Apple-designed compression algorithm, and data can be compressed or decompressed with the compression_tool command on macOS:
compression_tool -decode -i Export.steps
Doing so reveals that Export.steps is just JSON encoded data:
{
  "StepSample" : [
    {
      "intervalStart" : 634993200,
      "stepCount" : 0,
      "floorsAscended" : 0,
      "distanceMeters" : 0
    },
    {
      "intervalStart" : 634989600,
      "stepCount" : 24,
      "floorsAscended" : 0,
      "distanceMeters" : 20
    }
    // ...
  ],
  "PushSample" : [/* ... */],
  "StepCount" : [
    {
      "intervalStart" : 468194400,
      "stepCount" : 3827,
      "floorsAscended" : 3458, // ← distanceMeters has overwritten the floors field
      "distanceMeters" : 3458
    },
    {
      "intervalStart" : 468108000,
      "stepCount" : 7673,
      "floorsAscended" : 6927, // ← distanceMeters
      "distanceMeters" : 6927
    },
    // ...
  ],
  "GoalPoint" : [/* ... */]
}
- StepSample contains step, distance, and floors data in 15 minute intervals. This is the new data format used since November 2015.
- PushSample contains the equivalent data for wheelchair users.
- StepCount contains step, distance, and floors data in the older format, in 1 day intervals.
- GoalPoint stores your daily step goals over time.
For all of these, the intervalStart field is the start of the interval expressed as an Apple Cocoa Core Data timestamp, i.e. the number of seconds since 2001-01-01 00:00:00 UTC.
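As a minimal sketch of that conversion in Ruby: the helper name core_data_to_time is made up for this example, and 978307200 is the number of seconds between the Unix epoch and 2001-01-01 00:00:00 UTC. Which calendar day a timestamp lands on depends on the time zone it is read in, and the +02:00 offset used below is an assumption, not something stored in the file.

```ruby
require "time"
require "date"

# Seconds between the Unix epoch (1970-01-01) and the Core Data
# reference date (2001-01-01 00:00:00 UTC).
CORE_DATA_EPOCH_OFFSET = 978_307_200

# Hypothetical helper: convert a Core Data timestamp to a Ruby Time in UTC.
def core_data_to_time(seconds)
  Time.at(seconds + CORE_DATA_EPOCH_OFFSET).utc
end

time = core_data_to_time(468_194_400)
puts time.iso8601
# => 2015-11-02T22:00:00Z

# The same instant falls on the next calendar day at an (assumed) +02:00 offset.
puts time.getlocal("+02:00").to_date.iso8601
# => 2015-11-03
```

Because the daily entries appear to start at local midnight, converting to a date in UTC could put an entry on the wrong day, so it seems safest to do the conversion in the time zone the data was recorded in.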
Luckily for me, I only need to touch the older data stored under the StepCount field, which greatly simplifies things.
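To get a feel for how much of the StepCount data is affected, a rough check along these lines could be used. It is only a sketch: the suspicious? helper is hypothetical, the third sample entry is invented, and it assumes the corruption always shows up as floorsAscended being a copy of a non-zero distanceMeters, as in the excerpt above.

```ruby
require "json"

# Hypothetical helper: treat an entry as corrupted when the floors value
# is a copy of a non-zero distance value. This is an assumption based on
# the excerpt, not a documented property of the file format.
def suspicious?(item)
  item["distanceMeters"].to_i > 0 &&
    item["floorsAscended"] == item["distanceMeters"]
end

# Inline sample standing in for the decoded Export.steps contents.
# The first two entries come from the excerpt; the third is invented.
sample = JSON.parse(<<~SAMPLE)
  {
    "StepCount": [
      { "intervalStart": 468194400, "stepCount": 3827, "floorsAscended": 3458, "distanceMeters": 3458 },
      { "intervalStart": 468108000, "stepCount": 7673, "floorsAscended": 6927, "distanceMeters": 6927 },
      { "intervalStart": 500000000, "stepCount": 5000, "floorsAscended": 12, "distanceMeters": 4100 }
    ]
  }
SAMPLE

corrupted = sample["StepCount"].select { |item| suspicious?(item) }
puts "#{corrupted.length} of #{sample["StepCount"].length} entries look corrupted"
# => 2 of 3 entries look corrupted
```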
My first idea was to set the number of floors to zero for any day with more than, say, 100 floors ascended. However, looking at the data in the Health app on iOS revealed that the flights climbed numbers there looked perfectly reasonable for 2015.
The Health app lets you export all your health data as a zipped XML file. The file size for the XML file was almost 2GB once unzipped, so I chose to use Nokogiri’s SAX parser to process it.
Beware, the code below isn’t particularly pretty but it does the job:
#!/usr/bin/env ruby
# frozen_string_literal: true

require "bundler/inline"
require "time"
require "set"
require "open3"
require "json"

gemfile do
  source "https://rubygems.org"
  gem "nokogiri"
end

# <Record
#   type="HKQuantityTypeIdentifierFlightsClimbed"
#   sourceName="Mercury"
#   unit="count"
#   creationDate="2014-11-28 18:03:26 +0200"
#   startDate="2014-11-28 17:54:52 +0200"
#   endDate="2014-11-28 17:54:54 +0200"
#   value="2" />
class FloorsParser < Nokogiri::XML::SAX::Document
  HKQuantityTypeIdentifierFlightsClimbed = "HKQuantityTypeIdentifierFlightsClimbed"
  OUTPUT_COUNT = 10_000

  attr_reader :start_date_filter, :end_date_filter, :start_time_filter,
              :end_time_filter, :allowed_sources, :found_sources, :data

  def initialize(start_date:, end_date:, allowed_sources: [])
    @start_date_filter = start_date
    @end_date_filter = end_date
    @start_time_filter = start_date.to_time.freeze
    @end_time_filter = Time.new(
      end_date.year, end_date.month, end_date.day,
      23, 59, 59
    ).freeze
    @allowed_sources = allowed_sources
  end

  def start_document
    puts "Starting document processing... (each dot represents #{OUTPUT_COUNT} records)"
    @found_sources = Set.new
    @data ||= {}
    @processed = 0
  end

  def start_element(name, attrs = [])
    case name
    when "Record"
      @record = attrs.to_h
      @processed += 1
      print "." if (@processed % OUTPUT_COUNT) == 0
    end
  end

  def end_element(name)
    case name
    when "Record"
      if @record && @record.fetch("type") == HKQuantityTypeIdentifierFlightsClimbed
        @found_sources.add(@record["sourceName"])

        if allowed_sources.include?(@record["sourceName"])
          start_time = Time.parse(@record["startDate"])
          end_time = Time.parse(@record["endDate"])
          return unless start_time >= start_time_filter && end_time <= end_time_filter

          # Records are attributed to the day they start on; warn if one
          # spills over into the next day.
          date = start_time.to_date
          if date != end_time.to_date
            puts "\nDay crossing record! #{date.iso8601}"
            puts " (#{(end_time - start_time).to_i}s) [#{@record["sourceName"]}]\n"
          end

          # Sum the flights per day and per source
          @data[date] ||= {}
          @data[date][@record["sourceName"]] ||= 0
          @data[date][@record["sourceName"]] += @record["value"].to_i
        end
      end
      @record = nil
    end
  end

  def end_document
    puts "\n---\n"
    puts "Date range: #{start_date_filter.iso8601} → #{end_date_filter.iso8601}"
    puts "Found sources: #{@found_sources.to_a.join(", ")}"
    puts "Allowed sources: #{allowed_sources.join(", ")}"
    puts "Data count: #{@data.keys.length}"
  end
end

# Create our parser
floors_parser = FloorsParser.new(
  start_date: Date.new(2014, 1, 1),
  end_date: Date.new(2015, 12, 31),
  allowed_sources: ["Mercury"]
)
parser = Nokogiri::XML::SAX::Parser.new(floors_parser)

# Send some XML to the parser
parser.parse(File.open("./export.xml"))
data = floors_parser.data

puts "Decompressing steps data"
decoded, stderr, status = Open3.capture3("compression_tool -decode -i ./Export.steps")

if status.success?
  steps = JSON.parse(decoded)

  puts "Fixing floors ascended data"
  steps["StepCount"].each do |item|
    # Parse the Core Data timestamp (seconds since 2001-01-01 00:00:00 UTC)
    time = Time.strptime((item["intervalStart"] + 978307200).to_s, "%s")
    # item["intervalStart"] = time # for debugging
    if (flights = data.dig(time.to_date, "Mercury"))
      item["floorsAscended"] = flights
    else
      item["floorsAscended"] = nil
    end
  end

  puts "Replacing missing floors ascended data with the daily average"
  values = steps["StepCount"].map { |i| i["floorsAscended"] }.compact
  average = (values.sum.to_f / values.length).round
  steps["StepCount"].each do |item|
    item["floorsAscended"] = average if item["floorsAscended"].nil?
  end

  puts "Saving the fixed data"
  filename = "Export-fixed-#{Time.now.to_i}.steps"
  Open3.popen3("compression_tool -encode -o ./#{filename}") do |stdin, stdout, stderr, wait_thr|
    stdin.write JSON.pretty_generate(steps, space_before: " ")
    stdin.close

    exit_status = wait_thr.value
    if exit_status.success?
      puts "Success! Generated #{filename}"
    else
      # Process::Status#to_i returns the raw status bits, so use exitstatus
      exit(exit_status.exitstatus || 1)
    end
  end
else
  puts "Decoding steps failed"
  puts stderr
  exit(status.exitstatus || 1)
end
The process goes roughly like this:
- Parse out the data from the XML file exported from Apple Health
  - In my case I only cared about data from 2015 and from just one source: my iPhone from the time (named ‘Mercury’)
  - The export has the flights ascended data in < 1 day intervals, so combine those into a single value per day
  - Plus some code to guard against edge cases, like intervals crossing day boundaries
- Decompress and parse the steps data exported from Pedometer++
- Iterate over the StepCount data and replace the floorsAscended number with the value from the parsed Apple Health data
  - If there wasn’t any data for the given day, use the average from the days with data
  - There weren’t all that many days with no data, so I could really have just set the floors value for those days to zero…
- Generate JSON for the fixed data and compress it with the compression_tool command
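The replacement step can also be isolated into a small function, which makes it easy to swap the average for zero. This is a sketch rather than the script’s exact code: replace_floors and its fallback: parameter are names made up for this example, the sample numbers are invented, and each item is assumed to already carry a Date under a made-up "date" key (the real file only has intervalStart).

```ruby
require "date"

# Hypothetical helper. `items` are hashes with a "date" (Date) and a
# "floorsAscended" value; `flights_by_date` maps Date => flights climbed.
# `fallback` is :average (rounded mean of the days that have data) or :zero.
def replace_floors(items, flights_by_date, fallback: :average)
  known = items.map { |item| flights_by_date[item["date"]] }.compact
  default =
    if fallback == :zero || known.empty?
      0
    else
      (known.sum.to_f / known.length).round
    end

  # Return new hashes instead of mutating the input
  items.map do |item|
    item.merge("floorsAscended" => flights_by_date[item["date"]] || default)
  end
end

# Invented sample data: the third day has no Apple Health entry.
items = [
  { "date" => Date.new(2015, 11, 1), "floorsAscended" => 6927 },
  { "date" => Date.new(2015, 11, 2), "floorsAscended" => 3458 },
  { "date" => Date.new(2015, 11, 3), "floorsAscended" => 5120 }
]
flights = { Date.new(2015, 11, 1) => 14, Date.new(2015, 11, 2) => 10 }

p replace_floors(items, flights).map { |i| i["floorsAscended"] }
# => [14, 10, 12]
p replace_floors(items, flights, fallback: :zero).map { |i| i["floorsAscended"] }
# => [14, 10, 0]
```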
Parsing the XML data is the longest step and it takes about 80 seconds to run.
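Before importing the result, it can be worth sanity-checking the decoded JSON. The sketch below is not part of my script: floor_problems is a hypothetical helper, the sample data is invented, and the limit of 100 floors per day is the same arbitrary threshold I considered earlier. It works on the JSON text, which on macOS could come from decoding the fixed file with compression_tool.

```ruby
require "json"

# Hypothetical check: return the StepCount entries whose floorsAscended is
# missing, not an integer, negative, or above an arbitrary daily limit.
def floor_problems(json_text, max_floors: 100)
  steps = JSON.parse(json_text)
  steps.fetch("StepCount").reject do |item|
    floors = item["floorsAscended"]
    floors.is_a?(Integer) && floors.between?(0, max_floors)
  end
end

# Invented sample data; the second entry still has the distance in it.
json_text = JSON.generate({
  "StepCount" => [
    { "intervalStart" => 468_108_000, "stepCount" => 7673, "floorsAscended" => 14, "distanceMeters" => 6927 },
    { "intervalStart" => 468_194_400, "stepCount" => 3827, "floorsAscended" => 3458, "distanceMeters" => 3458 }
  ]
})

problems = floor_problems(json_text)
puts "#{problems.length} problematic entries"
problems.each { |item| puts item["intervalStart"] }
```

An empty result does not prove the data is right, only that nothing is obviously out of range.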
To import the fixed data, I deleted and reinstalled the Pedometer++ app and then opened the fixed .steps file from an email that I sent to myself. I first tried uploading the fixed file to iCloud Drive and opening it from the Files app on my phone, but for some reason that didn’t work (Pedometer++ launched but nothing was imported).
Finally, checking the achievements in Pedometer++ reveals that the fix worked: