Gnome Posted November 4, 2016
Hi all. Is anyone else experiencing a CRC mismatch every so often on their Axpert inverter? That is, the Axpert responds with data that looks correct, but the last part (the CRC) is wrong. I'm starting to wonder whether it is the firmware, the cable or the serial port converter. I'm thinking I should put a device between the Axpert and my software to capture the raw serial data and see what's up. However, the raw data my computer receives over serial already shows the CRC mismatch, so I highly doubt it is my software. I'm pretty sure this didn't use to happen... but now I'm unsure.
Coulomb Posted November 4, 2016
I've not noticed this, but haven't been looking closely. QPIGS has a very long response, and the response values change a lot, so it's difficult to spot whether one data element has changed a little. Perhaps set up a small script or something to send regular shorter commands that have a fixed response, e.g. QID or QVFW. If those never seem to vary, then perhaps try larger responses with fixed data (e.g. QPIRI, the ratings enquiry).
If the CRC, when wrong, always has one character that is one more than you expect, then perhaps you're not handling the situation where the CRC would end up containing a carriage return, linefeed or open parenthesis; in these three cases, the CRC character is incremented (so CR (0x0D) to ^N (0x0E), LF (0x0A) to ^K (0x0B), and '(' (0x28) to ')' (0x29)).
If you get errors, even off-by-one errors, in other received CRC values, then either the CRC is being sent incorrectly or received incorrectly. To figure out which, I guess you need to try different computers, USB to serial adapters (if used), different and shorter cables (if relevant), and so on.
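The rule Coulomb describes can be sketched in a few lines of Python. This is a minimal sketch, not the manufacturer's code: it assumes the CRC is the standard CRC16 XMODEM (polynomial 0x1021, initial value 0, as Gnome confirms later in the thread), that the adjustment is applied to each of the two CRC bytes separately, and the function names are made up for illustration.

```python
# Bytes that must not appear in the CRC: '(' starts a reply, CR ends a frame, LF is also avoided.
RESERVED = (0x28, 0x0D, 0x0A)


def crc16_xmodem(data: bytes) -> int:
    """Plain bitwise CRC16 XMODEM: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def adjust_crc_byte(value: int) -> int:
    """Increment a CRC byte by one if it collides with a reserved character."""
    return value + 1 if value in RESERVED else value


def axpert_crc_bytes(data: bytes) -> bytes:
    """CRC as it should appear on the wire: high byte, low byte, each adjusted."""
    crc = crc16_xmodem(data)
    return bytes(adjust_crc_byte(b) for b in (crc >> 8, crc & 0xFF))
```

For example, the single byte 0xA7 has the unadjusted CRC C5 0D, so `axpert_crc_bytes(b"\xa7")` gives C5 0E, which is the same off-by-one pattern described above.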
Guest Posted November 5, 2016
We have seen some really ridiculous data in the database, Axpert and Victron, inverters and controllers. Actually have to start looking at ways to suppress the dud data.
___ Posted November 5, 2016
10 hours ago, The Terrible Triplett said: We have seen some really ridiculous data in the database, Axpert and Victron, inverters and controllers. Actually have to start looking at ways to suppress the dud data.
I've seen an odd value with the Victron MK2 dongle, where it shortened the response and dropped one of the delimiting values. My own software didn't check for the delimiter; it simply assumed it would be there and used the values that followed it. The shortening meant I started using the checksum as half of a 16-bit value. Imagine my surprise when the small Multiplus was suddenly reporting discharge currents of 2000 amps! A simple unplug and replug fixed it.
Guest Posted November 5, 2016
6 minutes ago, plonkster said: A simple unplug and replug fixed it.
After having had to do that once too many times, it ALWAYS happened when no-one is around, losing hours of data, we sorted it on all devices. In rare cases now do I not have to unplug - it is annoying with 5 devices.
Gnome (Author) Posted November 6, 2016
On 11/5/2016 at 1:03 AM, Coulomb said: [...] If the CRC, when wrong, always has one character that is one more than you expect, then perhaps you're not handling the situation where the CRC would end up containing a carriage return, linefeed or open parenthesis; in these three cases, the CRC character is incremented (so CR (0x0D) to ^N (0x0E), LF (0x0A) to ^K (0x0B), and '(' (0x28) to ')' (0x29)). [...]
Hey Coulomb, thanks for taking the time to respond. Here is a failed example. Each number in the array is the character code (byte value) of one received character.
[40, 50, 52, 48, 46, 49, 32, 52, 57, 46, 57, 32, 50, 52, 48, 46, 49, 32, 52, 57, 46, 57, 32, 48, 50, 52, 48, 32, 48, 49, 56, 53, 32, 48, 48, 52, 32, 52, 51, 53, 32, 53, 52, 46, 48, 48, 32, 48, 48, 48, 32, 49, 48, 48, 32, 48, 48, 53, 53, 32, 48, 48, 48, 48, 32, 48, 48, 48, 46, 48, 32, 48, 48, 46, 48, 48, 32, 48, 48, 48, 48, 48, 32, 48, 48, 48, 49, 48, 49, 48, 49, 32, 48, 48, 32, 48, 48, 32, 48, 48, 48, 48, 48, 32, 49, 49, 48, 219, 14, 13]
The equivalent characters:
["(", "2", "4", "0", ".", "1", " ", "4", "9", ".", "9", " ", "2", "4", "0", ".", "1", " ", "4", "9", ".", "9", " ", "0", "2", "4", "0", " ", "0", "1", "8", "5", " ", "0", "0", "4", " ", "4", "3", "5", " ", "5", "4", ".", "0", "0", " ", "0", "0", "0", " ", "1", "0", "0", " ", "0", "0", "5", "5", " ", "0", "0", "0", "0", " ", "0", "0", "0", ".", "0", " ", "0", "0", ".", "0", "0", " ", "0", "0", "0", "0", "0", " ", "0", "0", "0", "1", "0", "1", "0", "1", " ", "0", "0", " ", "0", "0", " ", "0", "0", "0", "0", "0", " ", "1", "1", "0", "\xDB", "\x0E", "\r"]
OR:
"(240.1 49.9 240.1 49.9 0240 0185 004 435 54.00 000 100 0055 0000 000.0 00.00 00000 00010101 00 00 00000 110\xDB\x0E\r"
OK, I see: my code calculates the CRC as "\xDB\r", which means the \r in the CRC must have been shifted to \x0E, as you say. Hmm.
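A small diagnostic along these lines would have flagged this frame as the special case rather than as corruption. It is only a sketch under assumptions: it uses `binascii.crc_hqx` with an initial value of 0 as the XMODEM CRC, treats the last three bytes of a frame as CRC high byte, CRC low byte and the terminating CR, and the name `diagnose` and its return strings are invented for illustration.

```python
import binascii

RESERVED = (0x28, 0x0D, 0x0A)  # '(', CR, LF

# The frame posted above, CRC bytes included.
GNOME_FRAME = (
    b"(240.1 49.9 240.1 49.9 0240 0185 004 435 54.00 000 100 0055 0000 "
    b"000.0 00.00 00000 00010101 00 00 00000 110\xdb\x0e\r"
)


def diagnose(frame: bytes) -> str:
    """Classify a reply as 'ok', 'ok_adjusted', 'mismatch' or 'malformed'."""
    if len(frame) < 4 or not frame.endswith(b"\r"):
        return "malformed"
    payload, received = frame[:-3], frame[-3:-1]
    raw = binascii.crc_hqx(payload, 0).to_bytes(2, "big")
    adjusted = bytes(b + 1 if b in RESERVED else b for b in raw)
    if received != adjusted:
        return "mismatch"
    # A match that only holds after the increment is the special case from this thread.
    return "ok" if raw == adjusted else "ok_adjusted"
```

If the unadjusted CRC of the payload really is DB 0D, as reported above, then `diagnose(GNOME_FRAME)` should come back as "ok_adjusted" rather than "mismatch".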
Gnome (Author) Posted November 6, 2016
OK, this whole post is due to me being a noob. I used a standard implementation of CRC16_XMODEM and didn't shift the 3 problem cases. I really thought I had looked into that, but obviously it was late at night or something. I'm using a CRC16_XMODEM implementation that uses a lookup table instead of calculation, hence the screw-up, e.g.: https://github.com/postmodern/digest-crc/blob/master/lib/digest/crc16_xmodem.rb
After adding those 3 special cases, I've run it again for about 10 000 iterations without a single CRC error, whereas before it wouldn't take longer than 100 iterations to get an error.
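A soak test like the one Gnome describes could look roughly like this. It is a sketch with several assumptions: the serial link is hidden behind a caller-supplied `query` function (hypothetical; in practice it might wrap a serial port library), requests are framed as command + CRC + CR, and the reserved-byte adjustment is applied to request CRCs as well as replies.

```python
import binascii
from typing import Callable

RESERVED = (0x28, 0x0D, 0x0A)  # '(', CR, LF


def crc_bytes(data: bytes) -> bytes:
    """XMODEM CRC (via binascii.crc_hqx, initial value 0) with the three special cases."""
    raw = binascii.crc_hqx(data, 0).to_bytes(2, "big")
    return bytes(b + 1 if b in RESERVED else b for b in raw)


def build_request(command: bytes) -> bytes:
    """Frame a command as: command, two CRC bytes, carriage return."""
    return command + crc_bytes(command) + b"\r"


def reply_ok(reply: bytes) -> bool:
    """True if the reply ends in CR and its CRC matches the preceding bytes."""
    return (
        len(reply) >= 4
        and reply.endswith(b"\r")
        and reply[-3:-1] == crc_bytes(reply[:-3])
    )


def soak_test(query: Callable[[bytes], bytes],
              command: bytes = b"QID",
              iterations: int = 10_000) -> int:
    """Send the same command repeatedly and count replies that fail the CRC check."""
    request = build_request(command)
    failures = 0
    for _ in range(iterations):
        if not reply_ok(query(request)):
            failures += 1
    return failures
```

Using a fixed-response command such as QID, as Coulomb suggested, means any failure count above zero points at the link or the CRC handling rather than at changing data.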
Coulomb Posted November 6, 2016
No problem, Gnome. This is actually a case where we have source code from the manufacturer. I hear a collective gasp. Well, I imagined one. How was this obtained? One of the users on the AEVA forum asked them by email, and they immediately provided it. Who'd think of that? Though now that I think about it, that was very early on, on the first or second page of the long PIP/Axpert topic, when we thought that MPP Solar was the manufacturer. Perhaps they were more helpful than Voltronic would have been, had the user asked them instead.
In any case, the sample code has the three cases where the CRC character is incremented. If this were a deduced thing, or even a reverse-engineered thing, then we might fear that there could be more cases.
It's good to hear that the CRCs hold up, i.e. that the slow serial line is reliable, even with a high-power switch-mode converter in the same metal box.
___ Posted November 6, 2016
19 hours ago, The Terrible Triplett said: After having had to do that once too many times, it ALWAYS happened when no-one is around, losing hours of data, we sorted it on all devices. In rare cases now do I not have to unplug - it is annoying with 5 devices.
This strange condition only happened twice. The first time was on my local machine, directly after I used VeConfigure on it. I have Windows running in a virtual machine, so after I used the device on Windows I virtually unplugged it and turned the virtual machine off. The device itself remained plugged in on the Linux side, and when I next used it, it was in this weird state. I never saw the bug again for MONTHS. Then I saw it once more: that time you tried Blue Lantern on your Pi 1B. Never saw it again after that. Perhaps my software was a bit kinder to it... probably because it's so dead simple :-)
Guest Posted November 7, 2016
14 hours ago, plonkster said: Perhaps my software was a bit kinder to it... probably because it's so dead simple :-)
:-) Let's think about that for a moment... Data is read continuously and saved every X seconds. So we look at logs, I at my 7-day logs and others at Emon, and that is where we see these spikes, from data that came in at the exact moment we saved it, every X seconds. So it could happen more often than we think. We saw most errors happening between 8 and 10pm and again between 12 and 2am. We catered for them all, but the freakishly high values happen every once in a blue moon. Very annoying if you record the data in a DB.
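One common way to keep such spikes out of a database is a plausibility check just before the save, on top of any CRC check. The sketch below is only an illustration: the field names and limits are hypothetical and would need to be tuned to the actual installation, and it is not the method used by any of the software discussed in this thread.

```python
from typing import Dict, Mapping, Tuple

# Hypothetical limits (low, high) per reading; adjust to the real system.
LIMITS: Dict[str, Tuple[float, float]] = {
    "battery_voltage": (30.0, 70.0),
    "battery_discharge_current": (0.0, 200.0),
}


def is_plausible(sample: Mapping[str, float],
                 limits: Mapping[str, Tuple[float, float]] = LIMITS) -> bool:
    """Reject a sample if any limited reading is missing or out of range."""
    for name, (low, high) in limits.items():
        value = sample.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True
```

A reading such as a 2000 A discharge current would then be dropped (or logged separately for debugging) instead of being stored alongside good data.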
___ Posted November 7, 2016
1 hour ago, The Terrible Triplett said: Data is read continuously, saved every x seconds.
Dude, I'll be frank. I think the issue is in your software. I have not seen comparable issues, even when querying the mk2 every 5 seconds for weeks on end. The only issue I saw was that the mk2 sometimes starts up in a strange state (*), but I've never seen it go into that state during normal operation. The funny thing is that when it goes into that state, the CRC is still calculated perfectly over the shorter buffer. Anyway, we're talking Blue on an Axpert thread. Probably shouldn't. Basic agreement though: sometimes there are bugs in hardware and you have to work around them.
* The specific issue is with the CommandGetRAMVarInfo command, which is supposed to return two 16-bit numbers, a value and an offset. The sequence is supposed to be "0x8E Lo_value Hi_value 0x8F Lo_offset Hi_offset", a total of 6 bytes, but sometimes the 0x8F in the middle went AWOL. This is the only issue I know of, and I've seen it exactly twice.
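Checking the markers before trusting the values is cheap. The sketch below parses only the 6-byte sequence quoted above; the function name and exception are made up for illustration, and treating both numbers as signed little-endian 16-bit integers is an assumption, not something stated in this thread.

```python
import struct
from typing import Tuple


class FrameError(ValueError):
    """Raised when the reply does not have the expected layout."""


def parse_ram_var_info(body: bytes) -> Tuple[int, int]:
    """Parse '0x8E Lo_value Hi_value 0x8F Lo_offset Hi_offset' into (value, offset)."""
    if len(body) != 6:
        raise FrameError(f"expected 6 bytes, got {len(body)}")
    if body[0] != 0x8E or body[3] != 0x8F:
        raise FrameError("missing 0x8E/0x8F marker")
    # '<' = little-endian without padding, 'x' = skip a marker byte,
    # 'h' = signed 16-bit (signedness is an assumption).
    value, offset = struct.unpack("<xhxh", body)
    return value, offset
```

With a check like this, a reply that has lost its 0x8F marker raises an error instead of letting a checksum byte be read as half of a 16-bit value.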
Guest Posted November 7, 2016
The last problem happened on data from an Axpert. First one we ever saw. Of course it can be the code. If we can trap the problem, we can fix it.
___ Posted November 7, 2016
10 minutes ago, The Terrible Triplett said: The last problem happened on data from a Axpert. First one ever we saw.
Oh hang on, I thought you were still having trouble with the mk2, because it was in response to my "unplug and replug" suggestion. My bad! That's what comes of hijacking threads, hey?
Guest Posted November 7, 2016
Everything is not about Victron. Reading the MK2 data, compared to reading Axperts / wind controllers / Morningstar / VE.Direct, is a really really sad story. We have it working 99%, not yet 100% accurate. Will get a specialist on it one of these days.