Sunday, November 8, 2015

Comparing the "nonAP" Dexcom G4 with the "505 AP" Dexcom G4

I am delaying, once again, the RR variability ECG post: while it was probably one of the most enjoyable things I did, both from a "minor hacking" and from a learning point of view, it probably doesn't interest many readers.

G4 "non AP" vs the G4 "505 AP"


As you probably know if you follow the CGM world, the Dexcom G4 is currently available with two different algorithms.

The first one, which for convenience I will call "non AP", can be found in all G4 receivers sold outside the US and in all "pediatric" receivers currently sold in the US. It is the original algorithm the G4 used when it was first released.

The second one, which I will call the "505", can be found in the firmware-updated original receivers, the adult "share" receivers and, of course, the newly released G5.

It is widely assumed that the "non AP" algorithm relies on an average of the previous secondary raw values it has received, while the "505" algorithm is, at least in part, influenced by the collaboration Dexcom had with the University of Padua and tends to use secondary raw values more aggressively than the "non AP" algorithm, at least when they are marked clean...

Why only now?

 

Since I live in Belgium, the "505" algorithm has not yet officially been made available to us: it will, when the G5 hits our shores, which could be any time soon... or not. To be honest, I could and probably should have made this comparison earlier. By the way, I want to take this opportunity to thank reader "D." who offered to send me a "505" share receiver as soon as it was released (but I was busy with the Libre back then), reader "J." who offered to send G5 sensors for my "mad scientist experiments" and the many readers who offered tips on how to bypass the restrictions and re-flash the non US G4 firmware to the latest version. The T1D communities I have joined are full of wonderful people.

So why take the bait now? The first reason is that reader "K."'s offer to send me a G4 Share came at the right moment... exactly when Dexcom was releasing the G5. The second reason is that I don't plan to upgrade (or should I say downgrade) to the G5 any time soon, or until I have no choice. While the G5 runs the 505 algorithm, it would, at this point, be a step forward and three steps back from my point of view.

Test Setup and Limitations

 

  • We decided to run a "non AP" and a "505" version side by side. Max was extremely reliable during the test and both receivers were with him at all times. There's a very small difference in the number of packets received (data below).
  • Both receivers were started at the same time and have been calibrated, per manufacturer's instructions, at the exact same second using both hands to press OK simultaneously.
  • Limitation: we did not use our non-standard - but for us optimal - calibration strategy. That strategy has served us well with the "non AP" but I wasn't sure it would help the "505" as much. We skipped it to keep the field as even as possible.
  • The sensor we used lasted the whole seven-day period but will not be remembered as the best sensor we ever had. Post insertion, it showed quite a bit of that oscillation/secondary level noise on data marked as clean.
  • Our ISIG profile during the period probably wasn't a typical Type 1 Diabetic profile. That may have limited the benefit we derived from the "505".
  • While I try to do my best not to make unsubstantiated claims and present only data I have enough confidence to use for myself, keep in mind I have one subject and one glycemic profile. I exclude data I could not defend (see the accuracy comment below), I go through a ton of double checks, confidence factor calculations and other goodies on my data set but my goal is NOT to publish rock solid authoritative stuff. I just want to look at things less subjectively than what is seen in the average user report.

 

 Results 

 

!!! important note: when reading the charts, keep in mind that the data is interpolated every 30 seconds between actual data points !!!
The results given by both receivers were extremely close. The Pearson correlation coefficient for the whole period was 0.97559. Most of the discrepancy came from the first day post insertion, where the "505" would happily display "clean" but jumpy data whereas the "non AP" would average the jumps out. Here is a zoom of one such event.

The "505" may actually have spotted the rise sooner than the "non AP" but it spoiled its advantage by tracking the jumpy secondary raw too closely because it was marked "clean".

Here is the whole period display.
The first thing to note is that, when calibrated at exactly the same time, the two algorithms will produce results that are extremely close. Except for the startup issues shown above, we could not find a single situation where one algorithm became so confused that it differed markedly from the other.

The second aspect I wanted to look at was the speed of reaction - how much faster was the "505" compared to the "non-AP"? In a similar comparison, the Libre beat the G4 non AP by no less than 9 minutes (this correlates well with the Dexcom reported "G4 vs YSI" data and the Abbott reported "Libre vs YSI" data). This type of comparison or time delay determination is usually done by shifting the signal and finding the best correlation.
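For readers who want to try the shift-and-correlate idea themselves, here is a minimal sketch. It is not the script used for this post: the function name `best_lag`, the synthetic sine-wave traces and the sign convention (the slower trace is moved earlier in time) are all assumptions made for illustration. Both traces are assumed to be already resampled on the same 30-second grid.

```python
import numpy as np

def best_lag(leading, lagging, max_lag_steps):
    """Shift `lagging` earlier by 0..max_lag_steps samples and return the
    (lag, Pearson r) pair with the highest correlation against `leading`.
    Hypothetical helper, not the code used for the post."""
    leading = np.asarray(leading, dtype=float)
    lagging = np.asarray(lagging, dtype=float)
    best = (0, -np.inf)
    for lag in range(max_lag_steps + 1):
        if lag == 0:
            a, b = leading, lagging
        else:
            # Drop the first `lag` samples of the slower trace and the last
            # `lag` samples of the faster one so both keep the same length.
            a, b = leading[:-lag], lagging[lag:]
        r = np.corrcoef(a, b)[0, 1]
        if r > best[1]:
            best = (lag, r)
    return best

# Synthetic demonstration: two identical traces, one delayed by 6 samples.
STEP_MIN = 0.5                                       # 30 seconds per sample
t = np.arange(2000)
fast = 110 + 40 * np.sin(2 * np.pi * t / 400)        # made-up "fast" trace
slow = 110 + 40 * np.sin(2 * np.pi * (t - 6) / 400)  # same trace, 6 samples late

lag, r = best_lag(fast, slow, max_lag_steps=17)
print("Maximum correlation", r, "with delay", lag * STEP_MIN, "mins")
```

On this made-up data the search should recover the 6-sample (3.0 minute) delay that was built in. On real CGM traces the maximum is much flatter, which is why the correlation values for neighbouring shifts are so close to each other.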

On average in this test, the "505" beat the "non AP" by only 3 minutes (6 periods of 30 seconds). Cherry picking periods of rapid changes, I could find runs where the optimal correlation time shift was 5.5 minutes. I could also find runs where it lagged (on average) by 30 seconds. I decided not to cherry pick as it is a slippery slope: selective cherry picking could be used to demonstrate anything. Here is the zoom on a tennis afternoon where we kept falling and correcting as the exercise went on (keep in mind that in this zoom, there are again 10 interpolated points for each standard Dexcom data point). This particular exercise occurred 20 hours after insertion and the jumpiness of the signal is still somewhat present. In some circumstances, such as the last fall and rise, the 505 clearly reacted more quickly and accurately. But the first peak is clearly a draw.



Here is another 23-hour period where the "505" algorithm is mostly, but not always, ahead of the "non AP" one (on average by one minute, or two 30-second readings).
 

Again, two points worth noting:
  • as far as use in sports is concerned, the "505" is marginally better than the "non AP". However, that improvement is not nearly as spectacular as it is with the Libre and, in particular, the Libre spot checks (which I assume to be in part predictive). While we could play a whole tennis tournament using the Libre as our only BG measuring tool, we were forced to use BG tests to check that our re-carbing was sufficient during tennis training sessions.
  • the "data reality" differs markedly from our perception. I believe this is caused by the following subjective factor: if the "505" picks up a trend faster than the "non AP", it will remain ahead for the rest of the trend and the user who compares will be constantly reminded that the "505" is ahead. That creates a positive reinforcement and falsifies our perception a bit.
While I was a bit disappointed by the numbers - I expected something like a systematic 5 minutes and an optimal 7.5 minutes advantage for the "505" - let's keep in mind that every occasion where the 505 is ahead is a bonus for the user: knowing 5 minutes in advance that you are falling more quickly than you expected is significant, and knowing 5 minutes earlier that your re-carbing out of a hypo worked is reassuring.

Accuracy

 

As I said above, our sensor wasn't a stellar performer (we averaged a 14% MARD for both algorithms, which is at the poor end of what we usually get). This is why I won't provide a detailed accuracy analysis but just impressions: in a "gut feeling but not statistically significant" way, I'd say that the "505" showed better accuracy in the low range (below 80 mg/dl) but worse accuracy in the high range (above 150 mg/dl). This is somewhat visible in the global view where you can see the "non AP" climb above the "505" on several occasions: in all the cases we tested, the "non AP" was closer to the BG meter. Both the "non AP" and the "505" underestimated the BG value.
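For reference, MARD (mean absolute relative difference) is the average of |CGM - reference| / reference over all paired readings, expressed as a percentage. A minimal sketch of that calculation follows; the function name `mard` and the three paired readings are made up for illustration and are not data from this test.

```python
import numpy as np

def mard(cgm, meter):
    """Mean absolute relative difference, in percent, of CGM readings
    against paired reference (BG meter) readings. Hypothetical helper."""
    cgm = np.asarray(cgm, dtype=float)
    meter = np.asarray(meter, dtype=float)
    return float(np.mean(np.abs(cgm - meter) / meter) * 100.0)

# Made-up paired readings in mg/dl (NOT data from this test).
meter_values = [100, 200, 50]
cgm_values = [90, 220, 55]

# Each pair is off by 10% of the meter value, so the result is about 10.
print("MARD: %.1f%%" % mard(cgm_values, meter_values))
```

Because the error is divided by the reference value, the same absolute error weighs much more in the low range than in the high range, which is one reason a single MARD figure hides range-specific behaviour like the one described above.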

Closing Thoughts

 

The "non AP" approach is better for the possibly unstable conditions that characterize post-insertion. The "505" algorithm is generally better in all other cases, especially in the low range. It will, in most fast changing conditions, flag the rise or the fall more quickly than the "non AP".

This being said, the 5-minute sampling frequency severely limits the usefulness of the CGM for sports or any other activity where ups and downs are to be expected. While Dexcom seems to have cornered the market at the moment (late 2015), I can't help thinking about how much better the Libre sensor is. The following facts:
  • the Libre can be factory calibrated, which means that, in most cases, Abbott is able to produce and characterize sensors more accurately and consistently than Dexcom
  • the Libre is able to deliver a decent raw value, from close to off-the-shelf TI components, almost every minute, while Dexcom needs 5 minutes to deliver a secondary raw value that it does not always characterize accurately (jumpy secondary raw data marked as clean)
  • the Libre, in our experience, shows almost no drift until around the 12th day of its wear period
seem to indicate that, at the core, Abbott has a much better technology.

In the world of my dreams, Abbott would have very good customer service, would tell customers upfront that it steals their data, would have no supply chain issues and would provide more user-friendly remote data transmission.

About once a month, the conspiracy theorist in me wonders what exactly has been agreed between Dexcom and Abbott when they dropped their mutual lawsuits just before the Libre hit the market...

Even if Abbott is prevented from going full CGM by itself by some agreement, I still hope that when the supply chain issues are resolved, some slight Libre hardware modification could increase the transmission range somewhat in a way that could be practically exploited by the community to develop what Abbott doesn't want or can't put on the market...

Additional Data (and minimal comments)

 

Desync is equal to:   0:02:55
(clocks were desynchronized at start, packets were realigned)
New start after resync
G4 Non AP   :   2015-10-30 19:49:41
G4    505        :   2015-10-30 19:49:41
(lets force synchronization)
New 505 end after period selection :     2015-11-06 17:43:58
New nAP end after period selection:     2015-11-06 17:44:19
(sensor was stopped a bit early for a more convenient restart/reinsertion)
WTF, out of sync by 0:00:21
(receiver internal clocks drifted by 21 seconds over seven days - resynchronizing by 3 secs per day)
(a better approach would probably be to use identical time buckets, on the to-do list as impact is minimal)
Length of period 9954 mins - should have 1990 packets
Length of nAP 1939
Length of 505 1945
(nAP lost 51 packets, mostly water submersion during bath)
(505 lost 45 packets, same reason, no statistical difference)
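
The packet accounting above can be re-derived from the timestamps in a few lines. This is a sketch under assumptions, not the original script: it takes the 505 end time as the end of the period, assumes one packet every 5 minutes, and rounds down both the minutes and the expected packet count, which happens to reproduce the figures printed above.

```python
from datetime import datetime

PACKET_INTERVAL_MIN = 5  # one Dexcom reading every 5 minutes

# Timestamps copied from the output above (505 end time used as period end).
start = datetime(2015, 10, 30, 19, 49, 41)
end = datetime(2015, 11, 6, 17, 43, 58)

# Whole minutes in the period, then whole packets that fit in it.
period_min = int((end - start).total_seconds() // 60)
expected = period_min // PACKET_INTERVAL_MIN

# Packet counts copied from the output above.
received = {"nAP": 1939, "505": 1945}
lost = {name: expected - count for name, count in received.items()}

print("Length of period", period_min, "mins - should have", expected, "packets")
for name, n_lost in lost.items():
    print(name, "lost", n_lost, "packets")
```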

Start Time: 2015-10-30 19:49:41
Correlation: (0.97559478434207947, 0.0)
505 Mean 106.337201125
G4 Mean 107.807866184
(mean values for the period almost identical, so are SD and other indices, not shown here)

(best correlation data shown below - the differences were so small that I switched to a 30 seconds resolution for the virtual sensor)
Shifting G4 non AP data by 0.5 minutes
Correlation: 0.97679314605
Shifting G4 non AP data by 1.0 minutes
Correlation: 0.9775384715
Shifting G4 non AP data by 1.5 minutes
Correlation: 0.978129017586
Shifting G4 non AP data by 2.0 minutes
Correlation: 0.978566125344
Shifting G4 non AP data by 2.5 minutes
Correlation: 0.978851137811
Shifting G4 non AP data by 3.0 minutes
Correlation: 0.978985400009
Shifting G4 non AP data by 3.5 minutes
Correlation: 0.978970258929
Shifting G4 non AP data by 4.0 minutes
Correlation: 0.978807063511
Shifting G4 non AP data by 4.5 minutes
Correlation: 0.978497164627
Shifting G4 non AP data by 5.0 minutes
Correlation: 0.978041884366
Shifting G4 non AP data by 5.5 minutes
Correlation: 0.977444434606
Shifting G4 non AP data by 6.0 minutes
Correlation: 0.97670490467
Shifting G4 non AP data by 6.5 minutes
Correlation: 0.975823208612
Shifting G4 non AP data by 7.0 minutes
Correlation: 0.974799259945
Shifting G4 non AP data by 7.5 minutes
Correlation: 0.97363297164
Shifting G4 non AP data by 8.0 minutes
Correlation: 0.972324256124
Shifting G4 non AP data by 8.5 minutes
Correlation: 0.970873025277
Maximum correlation  0.978985400009 with delay  3.0 mins
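
The 30-second "virtual sensor" used for the shifts above is simply each receiver's 5-minute readings interpolated onto a common, finer time grid. A minimal sketch of one way to do that with linear interpolation is shown here; the function name `resample_30s` and the three sample readings are hypothetical, and linear interpolation is an assumption about the method.

```python
import numpy as np

def resample_30s(times_s, values, step_s=30):
    """Linearly interpolate readings onto a regular grid of `step_s` seconds.
    `times_s` are reading times in seconds, in increasing order.
    Hypothetical helper, not the code used for the post."""
    times_s = np.asarray(times_s, dtype=float)
    values = np.asarray(values, dtype=float)
    grid = np.arange(times_s[0], times_s[-1] + step_s, step_s)
    grid = grid[grid <= times_s[-1]]  # never extrapolate past the last reading
    return grid, np.interp(grid, times_s, values)

# Made-up example: three readings, 5 minutes apart, in mg/dl.
times = [0, 300, 600]
glucose = [100, 110, 90]

grid, virtual = resample_30s(times, glucose)
print(len(grid), "virtual points for", len(times), "real readings")
```

Each 5-minute gap becomes ten 30-second steps, which is where the "10 points for a standard Dexcom data point" in the zoomed charts comes from. Keep in mind that the interpolated points carry no new information: they only make it possible to test time shifts smaller than 5 minutes.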

