Change Up - Moving Averages in 'C'
Recently my inexpensive, archaic and inaccurate weather station started tossing out spurious values at a rate that grew from mildly annoying to "really pizzing me off!"After putting a little bit of research in on the types of errors, it appeared that some sort of "flyer" control algorithm would help. The easiest of these is the Moving Average (also called Running Average, Rolling Average, Sliding Average, Sliding Window Average).
And I needed it to be coded in 'C'. After 15 minutes of Googling, I decided to roll my own.
And here's the result.
Moving Average - Header File - Function Prototypes.
extern MovingAverage_t *MovingAverage_CreateMovingAverage( int maxElements );
extern int MovingAverage_AddValue( MovingAverage_t *maPtr, double value );
extern double MovingAverage_GetAverage( MovingAverage_t *maPtr );
extern int MovingAverage_GetElementCount( MovingAverage_t *maPtr );
extern int MovingAverage_Reset( MovingAverage_t *maPtr );
extern int MovingAverage_DestroyMovingAverage( MovingAverage_t *maPtr );
extern int MovingAverage_Resize( MovingAverage_t *maPtr, int newMaxElements );
extern int MovingAverage_GetValues( MovingAverage_t *maPtr, double returnValues[] );
extern void MovingAverage_DebugDump( FILE *fp, MovingAverage_t *maPtr );
The first call to CreateMovingAverage takes a number of elements to hang onto. Pass in '5' and the moving average keeps 5 values around. It returns a pointer to a structure of type "MovingAverage_t". You hang onto that structure and use it in the subsequent calls.
Here's the structure definition:
typedef struct MovingAverageStruct {
BufferElement_t *head;
int maxElements; // total size of the ring buffer (eg. 10))
int elementCount; // how many elements (values) are in the buffer. Will range from [0 to maxElements ]
int index; // next spot to use for a new value in the buffer [ 0.. maxE ]]
double runningSum; // we keep a running total
double currentAverage;
} MovingAverage_t;
The values to be averaged are stored as a Linked List. Here's the definition for BufferElement_t:
typedef struct BufferElement {
double value; // the value
struct BufferElement *next; // needed for singly- or doubly-linked lists
unsigned long sequence; // debugging only...
} BufferElement_t;
There are a number of ways to store the values, typically a Ring Buffer is used. And I could have implemented the Ring Buffer using an array, but I wanted the ability to easily grow the buffer size at run time, if necessary. So some sort of dynamic structure is called for, and a Linked List seemed like a reasonable first start.
Troy Hanson's UTLIST.H
A small diversion to recognize a fellow by the name of Troy Hanson (see his wordpress blog).I've used Troy's code before and it works wonderfully. His code is here.
The most amazing thing about Troy's work is that he's done it all using C Preprocessor Macros!
That means there's no runtime dependency on a library, shared or static. Just include his header file, follow his coding conventions and you've got a fully functional Linked List as easy as 1, 2, 3!
Troy is, by far and away, the best preprocessor guru I've run across. Thank you Troy, for your great work.
Back to the code, the only thing Troy's macros need for a singly linked list is a variable named "next" that points to the next list item. Don't change the variable name -- it must be called "next".
Here's the complete header file:
/*
* File: movingaverage.h
* Author: pconroy
*
* Created on January 19, 2015, 8:46 AM
*/
#ifndef MOVINGAVERAGE_H
#define MOVINGAVERAGE_H
#ifdef __cplusplus
extern "C" {
#endif
//
// Going to use Troy Hanson's Excellent LinkedList Macros
#include <stdio.h>
#include <UtHash/utlist.h>
#ifndef FALSE
# define FALSE 0
# define TRUE (!FALSE)
#endif
#define MASUCCESS 0
#define MAFAILURE 1
/*
* You can use any structure with these macros, as long as the structure contains a next pointer.
* If you want to make a doubly-linked list, the element also needs to have a prev pointer.
*/
typedef struct BufferElement {
double value;
struct BufferElement *next; /* needed for singly- or doubly-linked lists */
unsigned long sequence; // debugging
} BufferElement_t;
typedef struct MovingAverageStruct {
BufferElement_t *head;
int maxElements; // total size of the ring buffer (eg. 10))
int elementCount; // how many elements (values) are in the buffer. Will range from [0 to maxElements ]
int index; // next spot to use for a new value in the buffer [ 0.. maxE ]]
double runningSum; // we keep a running total
double currentAverage;
} MovingAverage_t;
extern MovingAverage_t *MovingAverage_CreateMovingAverage( int maxElements );
extern int MovingAverage_AddValue( MovingAverage_t *maPtr, double value );
extern double MovingAverage_GetAverage( MovingAverage_t *maPtr );
extern int MovingAverage_GetElementCount( MovingAverage_t *maPtr );
extern int MovingAverage_Reset( MovingAverage_t *maPtr );
extern int MovingAverage_DestroyMovingAverage( MovingAverage_t *maPtr );
extern int MovingAverage_Resize( MovingAverage_t *maPtr, int newMaxElements );
extern int MovingAverage_GetValues( MovingAverage_t *maPtr, double returnValues[] );
extern void MovingAverage_DebugDump( FILE *fp, MovingAverage_t *maPtr );
#ifdef __cplusplus
}
#endif
#endif /* MOVINGAVERAGE_H */
Let's quickly revisit the functions exposed by the code. We already covered off on CreateMovingAverage().
AddValue() is called when you have a new value (of type double) to add to the running average. If the buffer is not full, (element count < max elements) then the value is appended to the linked list. If the buffer is full, then the oldest element is removed and the value appended. The buffer always stays, at most, max elements full.
GetAverage() returns the current moving average in the buffer. There's an assert() in the code to make sure the element count isn't zero. Don't call it until there's at least one value in the list.
GetElementCount() returns the number of values in the list. It'll range from 0 to maxElements passed into the create call.
Reset() deletes the linked list of values and resets all counters to zero. It does NOT resize the list - maxElements stays the same as when the Create() call was made.
DestroyMovingAverage() deletes the list, deletes any memory allocated and sets the maxElements value to zero. Pair it with the Create() call and call it when you're done using the moving average.
Resize() allows you to grow the ring buffer. You can make the buffer larger, store more values. Currently you can't shrink it. (Until I convince myself of the need and a way to keep 'n' values when shrinking, it isn't implemented.)
GetValues() lets you pass in a pointer to an array of doubles. The values are pulled out of the list and passed back. For debugging.
DebugDump() dumps the contents of the structure to a file for logging and debugging.
That's the header file. Let's move onto the source code:
/*
* File: movingaverage.c
* Author: pconroy
*
* Created on January 15, 2015, 2:50 PM
*/
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include "movingaverage.h"
static unsigned long globalSequenceCounter = 0UL;
//------------------------------------------------------------------------------
//
// Call this function first to obtain a pointer to a Moving Average structure
// The buffer will be ready to take new values after this call
MovingAverage_t *MovingAverage_CreateMovingAverage (int maxElements)
{
MovingAverage_t *maPtr = malloc( sizeof( MovingAverage_t ));
if (maPtr != (MovingAverage_t *) 0) {
maPtr->head = (BufferElement_t *) 0;
maPtr->elementCount = 0;
maPtr->index = 0;
maPtr->runningSum = 0.0;
maPtr->currentAverage = 0.0;
maPtr->maxElements = maxElements; // size of the ring buffer
}
return maPtr;
}
//------------------------------------------------------------------------------
//
// Add a new value (of type double) to the list. This will force a recalculation
// of the average
int MovingAverage_AddValue (MovingAverage_t *maPtr, double value)
{
if (maPtr != (MovingAverage_t *) 0) {
BufferElement_t *newElement = malloc( sizeof( BufferElement_t ));
if (newElement != (BufferElement_t *) 0 ) {
newElement->value = value;
newElement->sequence = globalSequenceCounter;
globalSequenceCounter += 1UL;
//
// If there's still room in the buffer, if we haven't wrapped around yet
if (maPtr->elementCount < maPtr->maxElements) {
LL_APPEND( maPtr->head, newElement );
maPtr->elementCount += 1;
maPtr->index += 1;
} else {
//
// Buffer is full, time to wrap around and over write the value at "index"
// Where will this new element go? Here:
int newElementSpot = (maPtr->index % maPtr->maxElements);
// First - advance to the element and get value that's already there
BufferElement_t *oldElement = (maPtr->head);
for (int i = 0 ; i < newElementSpot; i ++)
oldElement = oldElement->next;
//
// Remove the value from the running total
maPtr->runningSum -= oldElement->value;
//
// Swap out the old element with the new
LL_REPLACE_ELEM( maPtr->head, oldElement, newElement );
//
// Update the index, but it needs to wrap around
maPtr->index = ((maPtr->index + 1) % maPtr->maxElements);
//
// Free up the memory used by the old, discarded element
free( oldElement );
}
maPtr->runningSum += value;
assert( maPtr->elementCount != 0 );
maPtr->currentAverage = ( maPtr->runningSum / maPtr->elementCount );
} else {
return MAFAILURE;
}
}
return MASUCCESS;
}
//------------------------------------------------------------------------------
//
// Return the average of all of the values in the buffer
double MovingAverage_GetAverage (MovingAverage_t *maPtr)
{
assert( maPtr->elementCount != 0 );
return maPtr->currentAverage;
}
//------------------------------------------------------------------------------
//
// Return how many values are currently in the buffer
int MovingAverage_GetElementCount (MovingAverage_t *maPtr)
{
return maPtr->elementCount;
}
//------------------------------------------------------------------------------
//
// Clear the contents of the list of values, free the memory, reset all sums and counts
// to zero. Do NOT free the head of the list, do NOT resize the list
int MovingAverage_Reset (MovingAverage_t *maPtr)
{
BufferElement_t *ptr1;
BufferElement_t *ptr2;
LL_FOREACH_SAFE( maPtr->head, ptr1, ptr2 ) {
free( ptr1 );
}
maPtr->elementCount = 0;
maPtr->currentAverage = 0.0;
maPtr->index = 0;
maPtr->runningSum = 0.0;
return MASUCCESS;
}
//------------------------------------------------------------------------------
//
// Call Reset() to clear and free memory, then free the head ptr and set the list size to
// zero. The list should not be reused after this
int MovingAverage_DestroyMovingAverage (MovingAverage_t *maPtr)
{
MovingAverage_Reset( maPtr );
free( maPtr->head );
maPtr->maxElements = 0;
return MASUCCESS;
}
// -----------------------------------------------------------------------------
int MovingAverage_Resize (MovingAverage_t *maPtr, int newMaxElements)
{
//
// For now - we're only going to allow larger, not smaller
assert( newMaxElements > maPtr->maxElements );
//
// If we're increasing the size of the list - that's easy to do!
if (newMaxElements > maPtr->maxElements) {
maPtr->maxElements = newMaxElements;
//
// We're Not going to adjust the current index
} else {
//
// Shrinking is harder! -- Save the "newMaxElement" values, and drop the rest
//
}
return MASUCCESS;
}
//------------------------------------------------------------------------------
int MovingAverage_GetValues (MovingAverage_t *maPtr, double returnValues[])
{
BufferElement_t *ptr;
int index = 0;
LL_FOREACH( maPtr->head, ptr ) {
returnValues[ index ] = ptr->value;
index += 1;
}
return maPtr->elementCount;
}
//------------------------------------------------------------------------------
void MovingAverage_DebugDump (FILE *fp, MovingAverage_t *maPtr)
{
fprintf( fp, "Current Structure Values\n" );
fprintf( fp, " Element Count: %d\n", maPtr->elementCount );
fprintf( fp, " Max Elements : %d\n", maPtr->maxElements );
fprintf( fp, " Index : %d\n", maPtr->index );
fprintf( fp, " Average : %f\n", maPtr->currentAverage );
fprintf( fp, " Running Sum : %f\n", maPtr->runningSum );
BufferElement_t *ptr;
LL_FOREACH( maPtr->head, ptr ) {
fprintf( fp, " Seq: %lu Value: %f\n", ptr->sequence, ptr->value );
}
fprintf( fp, "Global Sequence Counter: %lu\n", globalSequenceCounter );
}
#if 0
int main(int argc, char** argv)
{
MovingAverage_t *aMovingAverage = MovingAverage_CreateMovingAverage( 3 );
if (aMovingAverage != (MovingAverage_t *) 0) {
for (int i = 0; i < 10; i += 1) {
(void) MovingAverage_AddValue( aMovingAverage, (double) i );
}
}
MovingAverage_DebugDump( aMovingAverage );
//
// Make it bigger
printf( "Make the buffer bigger.\n" );
MovingAverage_Resize( aMovingAverage, 20 );
for (int i = 100; i < 110; i += 1) {
(void) MovingAverage_AddValue( aMovingAverage, (double) i );
}
printf( "After making the buffer larger.\n" );
MovingAverage_DebugDump( aMovingAverage );
MovingAverage_DestroyMovingAverage( aMovingAverage );
return (EXIT_SUCCESS);
}
#endif
That's it. In my weather station code, I created a function called "checkAndAddValue()" for each value emitted by the weather station:
static
int checkAndAddValue (weatherStats_t *wsPtr, int statType, double value, double percentage)
It's job is to check the value against the moving average. I pass in a percentage variance threshold. Values that are outside the threshold are discarded.
For example, the interior temperature check/add function looks like:
if (!checkAndAddValue_ITemp( wsPtr, datum->indoor.temp, 15.0 )) {
Logger_LogWarning( "Reading's indoor temperature appears to be in error. Will be discarded. Value [%f]\n", datum->indoor.temp );
return FALSE;
}
The interior temperature reading coming in, has to be within 15% of the moving average or it's discarded.
The percentage is increased for outside temperature values since those can move a bit faster:
if (!checkAndAddValue_OTemp( wsPtr, datum->outdoor.temp, 30.0 )) {
Logger_LogWarning( "Reading's outdoor temperature appears to be in error. Will be discarded. Value [%f]\n", datum->outdoor.temp );
return FALSE;
}
Maybe the next post will be about the simple logging framework I use.
No comments :
Post a Comment