Published: 2014-09-16 | Categories: [»] Engineering, [»] Electricity & Electronics and [»] Programming.

In a [»] previous post about the implementation of RS232 connectivity for the PIC to a computer, I briefly talked about a method that could be used to substantially increase the transfer speed between the microcontroller and the computer. At the time it was only a theoretical procedure but I have now validated it through experimentation.

I propose here to build an 11-bit per sample, 5 kHz oscilloscope/data-logger unit based on a PIC16F688 with a 20 MHz crystal. I will not cover the data acquisition part of the circuit, which has already been discussed in the [»] analog resolution doubler circuit post, and I will focus solely on how to achieve maximum transfer speed through a technique known as task switching. I will briefly cover hardware interruptions and show how to use them in a quite elegant way.

Building an oscilloscope is relatively simple in principle because all one has to do is record analog voltages using an analog-to-digital converter and send them to a computer for display/storage. Some oscilloscopes have fancy features such as gain selection, multiple inputs, high resolution and so on. Here, I will assume a very basic oscilloscope with only one input channel and no gain selection. But you’ll see that it is already full of surprises.

The basic implementation follows what we just said: read a voltage and send it through the communication channel. However, I have shown in the [»] RS232 implementation post that sending the data as-is is not such a good idea because the communication channel is immune to neither data corruption nor data loss. By adding an identification field and a checksum field, we can check that the data is not corrupted. We may also add a clock field, a counter incremented with each packet, to track data loss: data is considered lost if the clock field of the current packet minus the clock field of the previous packet is greater than one. This also allows computing the correct time for each packet by multiplying the clock value by the acquisition period, which is fixed by the microcontroller. Ideally, missed data should not be displayed, but they may be required for computations that need constant time steps such as Fourier Transforms; if you run into such a case, you may interpolate the missing data from the surrounding known values.
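As an illustration of the data loss test described above, here is a minimal sketch in C for the receiving side. The function name `packets_lost` is hypothetical, and the sketch assumes an 8-bit clock field that wraps from 255 back to 0 and that fewer than 256 packets are lost in a row.

```c
#include <stdint.h>

/* Number of packets lost between two consecutively received packets,
   given their 8-bit clock fields. The subtraction is done modulo 256 so
   that the clock wrapping from 255 to 0 is not reported as a loss.
   Assumption: fewer than 256 packets are lost in a row. */
static uint8_t packets_lost(uint8_t prev_clock, uint8_t cur_clock)
{
    uint8_t delta = (uint8_t)(cur_clock - prev_clock);

    /* delta == 1 is the normal case; delta == 0 (a repeated clock value)
       is treated as "nothing lost" in this sketch */
    return (delta == 0) ? 0 : (uint8_t)(delta - 1);
}
```

A result greater than zero tells the receiver how many packet periods to skip (or to interpolate) before time-stamping the new samples.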

With an 8-bit identification, an 8-bit checksum, an 8-bit clock and an 11-bit data payload, we end up with 5-byte packets which require about 0.48 ms to be sent at 115200 baud. Considering the time required for the analog-to-digital conversion (minimum 0.05 ms on a PIC16F688) and the checksum computation (let’s say 0.01 ms for 5 bytes), each sample would require about 0.54 ms to be sent, which limits the overall speed to about 1.8 kHz. And that’s all! It’s the maximum speed achievable for 11-bit data with 5-byte packets at 115200 bits per second. We may increase the speed a little by packing the data such that it fits into a 4-byte packet, but that would only allow going as high as 2.2 kHz. In practice, I have found that the ADC conversion speed is never that good when operating with C compilers, so the real speed you would obtain when implementing this will be somewhat lower, around 1.6 kHz.

But I promised you a 5 kHz acquisition speed, so where is the trick?

If you look at our basic implementation, you will notice that we make very poor use of the communication bandwidth because only about 30% of the packet contains sample data bits (35% when packed). If we could put more samples into one packet, we would share the identification, checksum and clock fields between several samples and increase the information ratio of our packets. For example, if we send two samples in one packet we would have 7-byte packets containing 22 bits of information (about 40% of the packet). Those 7 bytes would require 0.67 ms to transfer, plus about 0.1 ms for the data conversion (2 samples) and let’s say about 0.015 ms for the checksum, which makes a total of 0.78 ms for two samples and so an average speed of 2.5 kHz per sample! So even without packing, we have increased our theoretical speed limit from 1.8 kHz to 2.5 kHz just by grouping samples into a single packet. The situation gets even better as we keep adding more samples, as represented on Figure 1.

Figure 1

With an infinite number of samples per packet, we reach a limit case where the sizes of the identification, checksum and clock fields become negligible compared to the sample data. In our case, the speed limit is about 5.2 kHz for 2-byte samples (7.6 kHz when the samples are packed together). It is interesting to note that 95% of that theoretical value is already reached with 29 samples per packet. In practice, we will be limited by the memory of the microcontroller, which can be quite small (a few hundred bytes in most cases).
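The limit and the 95% figure can be checked with a few lines of C. This is only a sketch under stated assumptions: it counts transmission time alone (no ADC or checksum time, so the single-sample value comes out higher than the 1.8 kHz estimated earlier), it assumes 11 bits on the wire per byte (inferred from the 0.48 ms quoted for 5 bytes at 115200 baud) and 3 bytes of overhead per packet. The function names are hypothetical.

```c
/* Assumptions: 11 bits on the wire per byte (inferred from the timings
   quoted in the text) and 3 overhead bytes per packet (identification,
   checksum and clock fields). ADC and checksum times are ignored. */
#define BAUD_RATE           115200.0
#define WIRE_BITS_PER_BYTE      11.0
#define OVERHEAD_BYTES           3.0
#define BYTES_PER_SAMPLE         2.0

/* Transmission-limited sample rate (Hz) with n samples per packet. */
static double sample_rate_hz(unsigned n)
{
    /* the overhead bytes are shared between the n samples of a packet */
    double bytes_per_sample = BYTES_PER_SAMPLE + OVERHEAD_BYTES / (double)n;

    return BAUD_RATE / (WIRE_BITS_PER_BYTE * bytes_per_sample);
}

/* Limit for an infinite number of samples per packet. */
static double sample_rate_limit_hz(void)
{
    return BAUD_RATE / (WIRE_BITS_PER_BYTE * BYTES_PER_SAMPLE);
}
```

Under these assumptions `sample_rate_limit_hz()` gives about 5236 Hz, and the ratio `sample_rate_hz(n) / sample_rate_limit_hz()` reduces to 2n / (2n + 3), which first reaches 0.95 at n = 29.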

However, this is too beautiful to be that simple: we cannot just acquire N samples and then transmit them, because we would get bursts of data followed by silences, as represented on Figure 2.

Figure 2

On Figure 2 we can see a signal being read by the ADC for a period of 12 samples, which are then sent to the computer before the next 12 samples are recorded. During that transfer time, samples are missed. This is clearly something that you want to avoid because it will ruin any Fourier Transform algorithm, which requires constant time steps, and it is also a rather stupid behaviour for an oscilloscope to miss half of its data!

The solution is to use two buffers such that one buffer is filled with data while the second buffer is being sent to the computer. That way, no data is lost. The mechanism is inspired by 3D graphics technology, where awesome 3D scenes are rendered into one buffer in RAM while your screen is actually displaying the content of a second buffer. When the scene has finished rendering, the two buffers are swapped such that the freshly rendered buffer goes on screen and the other one becomes available for drawing. In 3D displays, this prevents flickering.

I will borrow the front buffer/back buffer terminology from there, with the roles assigned as in the code further down (graphics libraries usually name them the other way round, the front buffer being the one on display): a front buffer will be filled with the data being read by the ADC unit while a back buffer loaded with previous samples will be sent to the computer. The only requirement is that the back buffer empties faster than the front buffer fills up (otherwise, we would have a buffer overrun condition). In terms of implementation, we create two buffers (called “Buffer 1” and “Buffer 2”) and two pointers, which are the front and back buffer pointers. To swap the front and back buffers, we then only have to swap the pointers, which is much faster than actually swapping all the buffered data. The new acquisition scheme is represented on Figure 3.
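The pointer swap can be sketched in plain C, independently of the PIC. This is a simplified illustration, not the firmware itself: the names (`sample_buffer`, `push_sample`, `release_back_buffer`) and the buffer size are hypothetical, and a `NULL` back pointer is used here to mean "nothing waiting to be sent".

```c
#include <stddef.h>

#define BUFFER_SIZE 8   /* arbitrary size, for illustration only */

struct sample_buffer {
    unsigned char data[BUFFER_SIZE];
    unsigned char count;
};

static struct sample_buffer buffer1, buffer2;
static struct sample_buffer *front = &buffer1;  /* buffer being filled */
static struct sample_buffer *back  = NULL;      /* buffer waiting to be sent */

/* Store one sample in the front buffer. Returns 0 if the sample had to be
   dropped because the front buffer is full and the back buffer has not
   been sent yet (buffer overrun), 1 otherwise. */
static int push_sample(unsigned char sample)
{
    if (front->count >= BUFFER_SIZE) {
        if (back != NULL)
            return 0;                   /* overrun: sender is too slow */

        /* swap: only the pointers change, no data is copied */
        back = front;
        front = (front == &buffer1) ? &buffer2 : &buffer1;
        front->count = 0;
    }

    front->data[front->count++] = sample;
    return 1;
}

/* Called by the sender once the back buffer has been transmitted. */
static void release_back_buffer(void)
{
    back = NULL;
}
```

The acquisition side only ever calls `push_sample`, and the sending side only reads `back` and calls `release_back_buffer`, so the two activities never touch the same buffer at the same time.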

Figure 3

On a computer, we would simply implement this using two threads running concurrently. However, the PIC has no threads and executes a single instruction stream, so we have to use a second trick (which is the core trick of this article by the way). The idea is to use the hardware interruptions of the microcontroller to create a task switching pattern.

But let’s first review how interruptions work.

Hardware interruptions (or interrupts) are a mechanism that makes the microcontroller jump to a fixed address when some conditions are met. It’s like when you are making coffee and someone knocks on the door: you stop thinking about your coffee cup and go answer. In the case of the microcontroller, it’s exactly the same but with microsecond precision. It really jumps to the interruption address at the moment the condition occurs, not when it considers that it’s idling or that it is now okay to cooperate.

Microchip has implemented many interrupt conditions in its microcontrollers, such as when data is received on a communication channel, when an analog conversion is done, when a timer overflows, ... Here I will use two of them: timer overflow interruptions and “analog conversion done” interruptions. I will set Timer 0 to create an interruption every 204.8 µs (1024 instruction cycles with a 20 MHz crystal, the instruction clock being a quarter of the crystal frequency) and launch an analog conversion on that occasion. The processor will then idle until the “analog conversion done” interruption is triggered so the data can be stored into the front buffer, and idle again until the next timer overflow occurs. This “idle time” is used to send the back buffer to the computer. The mechanism is displayed on Figure 4.

Figure 4

That way, data will be sampled every 1024 instruction cycles, which makes a 4.88 kHz acquisition speed. Technically speaking, it is possible to create timer overflow interruptions at a faster rate by setting Timer 0 to a known value after each overflow (for example to 0x06 to create an interruption every 1000 cycles and achieve a pure 5.00 kHz), but this requires precise control of the instruction timing, which is difficult to get when using a C compiler. When leaving Timer 0 as it is, we are guaranteed to have a precisely known clock frequency of 5 MHz / 1024. This is the reason I’ve used 4.88 kHz and not 5.00 kHz, although the circuit can go that far (which I verified by measurement).
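The numbers above can be reproduced with a small calculation. The sketch below assumes an 8-bit Timer 0 with a 1:4 prescaler (which is what 1024 instruction cycles per overflow implies) and an instruction clock equal to a quarter of the crystal frequency. It ignores the few cycles lost when the timer register is rewritten, which is precisely the timing difficulty mentioned above. The function names are hypothetical.

```c
/* Timer 0 is an 8-bit counter: starting from `preload`, it overflows after
   (256 - preload) counts, each count lasting `prescaler` instruction
   cycles. Cycles lost while reloading the timer are not modelled. */
static unsigned long timer0_period_cycles(unsigned char preload, unsigned prescaler)
{
    return (256UL - preload) * prescaler;
}

/* Overflow rate in Hz; the instruction clock is the crystal frequency / 4. */
static double timer0_rate_hz(double crystal_hz, unsigned char preload, unsigned prescaler)
{
    return (crystal_hz / 4.0) / (double)timer0_period_cycles(preload, prescaler);
}
```

With a 20 MHz crystal, `timer0_rate_hz(20e6, 0, 4)` gives 4882.8 Hz for the free-running timer and `timer0_rate_hz(20e6, 6, 4)` gives 5000 Hz for the 0x06 preload.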

The code itself is relatively easy and is given below:

    #include <pic.h>

    #define null 0

    /* pin RC0 used for the negative voltage comparator aka 11th bit */
    #define NEGBIT RC0
    bit g_bNegative;

    /* 37x samples per packet */
    #define NUM_BYTES_PER_PACKET 74

    /* identification field */
    #define PACKET_DATA_IDENT 0xC3

    /* 8-bit ident, 8-bit cksum, 4-bit clock, 4-bit buffer over-run record, 37 samples */
    struct packet_data_s
    {
        unsigned char ucIdent;
        unsigned char ucChecksum;
        unsigned char ucTick:4;
        unsigned char ucSkipped:4;
        unsigned char ucPackedData[NUM_BYTES_PER_PACKET];
    };

    /* current byte in the buffer data */
    unsigned char g_ucCurrentByteIndex = 0;

    /* front and back buffers */
    struct packet_data_s g_buffer1, g_buffer2;
    struct packet_data_s *g_pFrontBuffer = &g_buffer1;

    /* volatile: written by the interrupt routine and polled by main() */
    struct packet_data_s * volatile g_pBackBuffer = null;

    /* init communication channel at 115200 baud for the PIC16F688 */
    void io_init(void)
    {
        TRISC5 = 1;
        TRISC4 = 1;

        SYNC = 0;
        BRGH = 1;
        TX9D = 0;
        SPEN = 1;
        SPBRG = 10;
        TX9 = 0;
        RX9 = 0;
        TXEN = 0;
        CREN = 1;
        TXIE = 0;
        RCIE = 0;
    }

    /* write a byte to RS232 and wait until it has been shifted out */
    void io_putc(char c)
    {
        TXEN = 1;
        TXREG = c;

        while(!TRMT);
    }

    /* write ucLength bytes to RS232 */
    void io_write(unsigned char *pucBytes, unsigned char ucLength)
    {
        while(ucLength--)
            io_putc(*(pucBytes++));
    }

    /* compute 8-bits checksum */
    unsigned char checksum8(unsigned char *pucData, unsigned char ucLength)
    {
        unsigned char cksum = 0;

        while(ucLength--)
        {
            cksum += ~(*pucData);
            pucData ++;
        }

        return ~cksum;
    }

    void init(void)
    {
        /* turn off comparator */
        CMCON0 = 0b00000111;

        /* A0 as analog pin and enable analog conversion done interrupts */
        TRISA0 = 1;
        ANSEL = 0b00000001;
        ADCON1 = 0b00100000;
        ADIE = 1;
        PEIE = 1;

        /* neg bit as input */
        TRISC0 = 1;

        /* timer 0 set to trigger interruptions every 1024 cycles */
        T0CS = 0;
        PSA = 0;
        PS0 = 1;
        PS1 = 0;
        PS2 = 0;
        T0IE = 1;
    }

    void interrupt ctrl(void)
    {
        /* record negative bit state and start conversion every 1024 cycles */
        if(T0IF)
        {
            T0IF = 0;

            ADCON0 = 0b10000101;
            ADCON0 |= 0b10;

            g_bNegative = NEGBIT;

            return;
        }

        /* conversion done */
        if(ADIF)
        {
            ADIF = 0;

            /* if front buffer is full */
            if(g_ucCurrentByteIndex >= NUM_BYTES_PER_PACKET)
            {
                /* buffer over-run detection */
                if(g_pBackBuffer != null)
                {
                    if(g_pFrontBuffer->ucSkipped < 0xf)
                        g_pFrontBuffer->ucSkipped ++;

                    /* skip data until back buffer is sent */
                    return;
                }

                /* swap buffers */
                g_pBackBuffer = g_pFrontBuffer;
                g_pFrontBuffer = (g_pFrontBuffer == &g_buffer1) ? &g_buffer2 : &g_buffer1;

                /* init buffer vars... */
                g_pFrontBuffer->ucSkipped = 0;
                g_ucCurrentByteIndex = 0;
            }

            /* write analog data to the front buffer */
            g_pFrontBuffer->ucPackedData[g_ucCurrentByteIndex] = ADRESH & 3;
            g_pFrontBuffer->ucPackedData[g_ucCurrentByteIndex++] |= ((unsigned char)g_bNegative) << 7;
            g_pFrontBuffer->ucPackedData[g_ucCurrentByteIndex++] = ADRESL;

            return;
        }
    }

    void main(void)
    {
        /* current clock */
        unsigned char ucTick = 0;

        /* initialize stuff... */
        g_bNegative = 0;

        g_buffer1.ucIdent = g_buffer2.ucIdent = PACKET_DATA_IDENT;
        g_buffer1.ucSkipped = g_buffer2.ucSkipped = 0;

        init();
        io_init();

        /* enable interruptions once everything is initialized */
        GIE = 1;

        while(1)
        {
            /* wait until a back buffer is available */
            if(g_pBackBuffer == null)
                continue;

            /* write current tick and checksum */
            g_pBackBuffer->ucTick = ucTick++;
            g_pBackBuffer->ucChecksum = 0;
            g_pBackBuffer->ucChecksum = checksum8((unsigned char*)g_pBackBuffer, sizeof(struct packet_data_s));

            /* send */
            io_write((unsigned char*)g_pBackBuffer, sizeof(struct packet_data_s));

            /* flag back buffer as empty */
            g_pBackBuffer = null;
        }
    }

It is a direct transcription of the concepts described above. I have added a buffer overrun detection which tells the computer if some samples were skipped because the sending operations were too slow for the acquisition speed. I found this quite useful when developing the code because it allowed me to check whether or not the transmission speed kept up with the data acquisition speed (i.e. the Timer 0 overflow rate).
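On the receiving side, the checksum can be verified by repeating what the firmware does: zero the checksum byte, recompute the sum over the whole packet and compare. The sketch below is an assumption about how such a check could look, not the receiver code from the RS232 post. `packet_is_valid`, `CHECKSUM_OFFSET` and `MAX_PACKET_SIZE` are hypothetical names, and the checksum is assumed to be the second byte of the packet with no padding, as in `struct packet_data_s`.

```c
#include <string.h>

/* Same algorithm as checksum8 in the firmware listing. */
static unsigned char checksum8(const unsigned char *data, unsigned char length)
{
    unsigned char cksum = 0;

    while (length--)
        cksum += (unsigned char)~(*data++);

    return (unsigned char)~cksum;
}

#define CHECKSUM_OFFSET 1     /* assumption: checksum is the second byte */
#define MAX_PACKET_SIZE 255   /* largest length an unsigned char can hold */

/* Returns 1 if the received packet carries a valid checksum, 0 otherwise. */
static int packet_is_valid(const unsigned char *packet, unsigned char length)
{
    unsigned char copy[MAX_PACKET_SIZE];
    unsigned char received;

    if (length <= CHECKSUM_OFFSET)
        return 0;

    /* work on a copy so the caller's packet is left untouched */
    memcpy(copy, packet, length);
    received = copy[CHECKSUM_OFFSET];

    /* the firmware zeroes the checksum field before computing the sum */
    copy[CHECKSUM_OFFSET] = 0;

    return checksum8(copy, length) == received;
}
```

A packet that fails this test should simply be dropped; the clock field of the next valid packet then reveals the gap.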

Concerning the code for the computer, it is based on the post I wrote about [»] RS232 communication and is left without commentary:

    virtual void onReceive(struct packet_s *packet)
    {
        static unsigned char ucLastTicks = 0xff;

        unsigned char ucTicks = packet->ucTick & 0xf;
        unsigned char ucSkipped = (packet->ucTick >> 4) & 0xf;

        if(ucLastTicks == 0xff)
            ucLastTicks = ucTicks;

        unsigned long ulTicksDelta;

        if(ucTicks < ucLastTicks)
            ulTicksDelta = (ucTicks + 16) - ucLastTicks;
        else
            ulTicksDelta = ucTicks - ucLastTicks;

        ucLastTicks = ucTicks;

        if(ulTicksDelta > 1)
            this->m_ulTicks += (ulTicksDelta - 1) * (NUM_BYTES_PER_PACKET >> 1);

        for(int i=0;i<NUM_BYTES_PER_PACKET;i+=2)
        {
            unsigned short usAnalog = (unsigned short)packet->ucPackedData[i+1] + (((unsigned short)packet->ucPackedData[i]) << 8);

            double fTime = (0.2e-6 * 1024.0) * (double)this->m_ulTicks;
            double fValue = (5.0 / 1024.0) * (usAnalog & 0x3ff);

            if(usAnalog & 0x8000)
                fValue = -fValue;

            addPoint(fTime, fValue);

            this->m_ulTicks ++;
        }

        /* the firmware counts skipped samples after the buffer is full,
           so they come after the samples of this packet */
        this->m_ulTicks += ucSkipped;
    }

Oh, by the way, the signal data from Figure 2 and Figure 3 are actual data obtained from a [»] 250 Hz sine generator circuit, so you can see it’s working pretty well ;-)

Upgrading the oscilloscope/data logger performances

I will now conclude this post with a few comments on the kind of upgrades that are possible but not mandatory for the comprehension of the technique.

Getting it faster

You may want to achieve a faster acquisition speed than 5 kHz. This is fine, but remember that you are bounded by the theoretical maximum transfer speed, which is 5.2 kHz at 115200 baud. There are still a few ways to increase it:

1/ Packing data. By packing the 11-bit samples it should be possible to transfer at 7.6 kHz. I have spent a few hours trying to make this work, but the code required to pack the data when filling the front buffer compiled into so many assembly instructions that the idling time between interruptions became too small to transfer the back buffer without overrun. In practice, data packing lowered the actual acquisition speed below the unpacked speed, so unless you can write good asm code, forget about it.

2/ Using smaller sample data. Technically speaking, the 5.2 kHz limit is valid for 2-byte samples, but if you are happy with 1-byte samples (8-bit resolution) then you can get twice as many samples without changing anything else. With our 4.88 kHz circuit, this could go as high as 9.8 kHz by setting the Timer 0 interruptions to 512 cycles (PS0=0 instead of PS0=1)! I have also tried this out but, unfortunately, did not get good results. Again, the idling time became too small to transfer the data and I had about 10 skipped samples (~1 ms) per packet. But I’m sure this can be fixed by proper optimization of both the sending and acquisition parts of the code, since the compiler I’ve used (PICCLITE) does not perform very well in its basic version. 10 kHz would be quite nice so it is worth a try!

3/ Using better hardware. This is the only solution left when everything has been optimized! By replacing the MAX232 (which is limited to 115200 baud) with other chips that support up to 1 Mbaud, we can reduce the transmission time to a fraction of what it currently is and so acquire more samples without having to worry about the idling time between interruptions. Also, replacing the PIC16F688 with one of those PIC18 with PLL clocks twice as fast, more bits of resolution for the ADC and a larger memory can make a big difference. I’m sure we can go up to 15-bit samples at 10 kHz under all these conditions... but this requires a lot of changes so it is not the easiest solution available.
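To make the packing idea of item 1 more concrete, here is what packing and unpacking 11-bit samples could look like in portable C. This is a host-side illustration only: it is not the code I tried on the PIC, it is not optimized for an 8-bit core, and the names `pack11` and `unpack11` are hypothetical. Each sample is treated as an opaque 11-bit value, written most significant bit first.

```c
#include <stdint.h>
#include <stddef.h>

/* Pack `count` 11-bit samples, most significant bit first, into `out`.
   `out` must hold at least (count * 11 + 7) / 8 bytes.
   Returns the number of bytes written. */
static size_t pack11(const uint16_t *samples, size_t count, uint8_t *out)
{
    uint32_t acc = 0;    /* bit accumulator */
    unsigned bits = 0;   /* number of pending bits in acc */
    size_t n = 0;

    for (size_t i = 0; i < count; i++) {
        acc = (acc << 11) | (samples[i] & 0x7FFu);
        bits += 11;

        /* emit every complete byte available in the accumulator */
        while (bits >= 8) {
            bits -= 8;
            out[n++] = (uint8_t)(acc >> bits);
        }
    }

    /* flush the remaining bits, padded with zeros on the right */
    if (bits > 0)
        out[n++] = (uint8_t)(acc << (8 - bits));

    return n;
}

/* Inverse operation: read `count` 11-bit samples from `in`. */
static void unpack11(const uint8_t *in, size_t count, uint16_t *samples)
{
    uint32_t acc = 0;
    unsigned bits = 0;
    size_t n = 0;

    for (size_t i = 0; i < count; i++) {
        /* load bytes until a whole sample is available */
        while (bits < 11) {
            acc = (acc << 8) | in[n++];
            bits += 8;
        }

        bits -= 11;
        samples[i] = (uint16_t)((acc >> bits) & 0x7FFu);
    }
}
```

With this layout, the 37 samples of a packet would occupy 51 bytes instead of 74. The shifts by variable amounts are what makes this expensive on a small PIC, which has no barrel shifter.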

Getting it slower

While it is an obvious choice to try increasing the maximum acquisition speed, the question of lowering it arises when we need to study much slower systems. In general, I would recommend always running the microcontroller at the highest speed possible and reducing the actual sampling rate in software on the computer through an averaging procedure. This has the benefit of reducing the noise by a factor equal to the square root of the number of samples averaged.

Here is a code snippet that you can use:

    double m_fLastTime = 0, m_fIntegrationTime = ...;
    double m_fBuffer = 0, m_fBufferN = 0;

    void addPoint(double fTime, double fValue)
    {
        this->m_fBuffer += fValue;
        this->m_fBufferN ++;

        if(fTime - this->m_fLastTime < this->m_fIntegrationTime)
            return;

        this->m_fLastTime = fTime;

        fValue = this->m_fBuffer / this->m_fBufferN;

        this->m_fBuffer = 0;
        this->m_fBufferN = 0;

        printf("\r%.3f Volts ", fValue);
    }

Having more analog channels

Recording more than one analog channel is not trivial because you must make sure that all channels are converted at the same time, especially when you study the dynamics of very fast systems (near the acquisition speed). For that reason, I would recommend building one circuit (with its RS232 connector) per analog channel and synchronizing all the circuits such that they sample values at exactly the same time. It is a more expensive solution because you have to duplicate your circuit, USB connector... but it is also easier to scale up because you could imagine plugging 10 circuits into a USB hub and having 10 channels on screen with one generic software.

In terms of implementation, the idea would be to separate the circuits into a master and slaves. The first circuit to be run would be the master, which generates a pulse every 1024 cycles to the INT0 (“Interrupt 0” interrupt) pins of the other circuits, which would be the slaves. You may use BNC cables to link the circuits to each other, or even optical fiber. The great thing about the master/slave approach is that all the circuits will be synchronized at the sub-microsecond level, especially if you connect the master pulse output to its own INT0 pin.

If you consider implementing this, also pay attention to the final remark below.

Important note about grounding

There is one last important thing to note about how the circuit is grounded. Even when operating on a battery, never forget that your oscilloscope circuit is connected to your computer ground. So, if your computer is on the mains supply and the system you are analyzing is on the same mains supply, connecting the oscilloscope ground to anything but the ground of the system you are studying (such as when trying to make differential measurements) can severely damage the hardware. To prevent this, always operate the oscilloscope on battery and unplug the power supply from your laptop (this will obviously not work with desktop computers...).

If you consider connecting several oscilloscopes together, you will have to handle true-differential inputs by using instrumentation amplifiers or equivalent solutions.
