The most “horriblest” of bugs I’ve had to deal with was with a language called ProIV.
It’s a high-level 4GL, and it allowed you to call other programs and pass in/out parameters.
If you passed a literal in to a program, and the program you called tried to pass a value back to that literal, then it actually changed the literal’s value!
So after that happened, the value of 1 wouldn’t equal ‘1’ any more, and lines like this:
myint = 6 + 1; if (myint == 7) { print "the answer is 7"; }
Would never be true.
This was a nightmare, as it changed the literal for the whole application, not just the current program, until you logged out and back in.
It was very hard to find this bug and understand what was happening, and we wasted many hours on it.
The best part: the owning company actually argued that it was a ‘feature’ and not a bug.
Yeah, relying on underflow for that doesn’t seem like a great idea, no offense.
If it’s known, reliable behavior, there’s no reason not to use it. It either works or it doesn’t. I find it important as a programmer to keep a clear picture of how my tools function and to use them as appropriate, deliberately avoiding dogmatic thoughts like “probably shouldn’t rely on this, but don’t know why”.
Using a signed int and checking for a negative value would achieve the same thing, no?
Kinda. That’s what I ended up doing. The reason it wouldn’t be my first choice is that I’m comparing the row index with an unsigned value, which means 1) I have to explicitly cast when comparing to avoid compiler warnings, and 2) it opens me up to another class of problem due to cutting the positive numeric range in half. In practice, I’m obviously not going to have 2147483648+ rows in this list, but it’s still important to keep in mind.
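For anyone following along, here is a minimal sketch in C of the signed-index approach discussed above. The function name and parameter types are hypothetical, since the actual code isn't shown in this thread; it only illustrates where the explicit cast goes.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether a signed row index refers to a
   real row in a list that stores its length as an unsigned value. */
bool row_index_is_valid(int row_index, unsigned int row_count) {
    if (row_index < 0) {
        /* Negative values (for example -1) act as the "no row" marker. */
        return false;
    }
    /* row_index is known to be non-negative here, so the cast cannot
       change its value; it only silences signed/unsigned comparison
       warnings. */
    return (unsigned int)row_index < row_count;
}
```

With a 10-row list, indices 0 through 9 pass, while -1 and 10 are rejected. The cost is the one mentioned above: the index can only reach INT_MAX, not UINT_MAX.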
I’ll have to dig into this deeper later today. I have a hunch that this is happening because I’m converting from float to unsigned int, and if I added an intermediate conversion through signed int, I’d still get the underflow. Both results are perfectly reasonable; it’s just interesting that ARM makes one decision while x86 makes the opposite one.
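One relevant detail: in C, converting a floating-point value to an integer type is undefined behavior when the truncated value doesn't fit in that type, which would be consistent with ARM and x86 giving different answers. A rough sketch of a conversion that makes the out-of-range cases explicit instead of leaving them to the CPU follows. The function name and the choice to clamp (rather than wrap) are assumptions for illustration, not what the code in this thread does.

```c
#include <limits.h>

/* Hypothetical helper: converts a float to unsigned int with explicit
   handling for values that do not fit, so the result does not depend
   on the CPU. Clamping is an arbitrary choice made for this sketch. */
unsigned int float_to_uint_clamped(float value) {
    if (!(value > 0.0f)) {
        /* Covers negative values, zero, and NaN (every comparison
           involving NaN is false). */
        return 0u;
    }
    if (value >= (float)UINT_MAX) {
        /* Too large to represent after truncation. */
        return UINT_MAX;
    }
    /* In range: the cast simply truncates toward zero. */
    return (unsigned int)value;
}
```

If the wrap-around result is what's actually wanted, the range check can return whatever sentinel the caller expects instead of 0.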
Only just discovered this thread, so time to share my all-time favourite bug from 35 years of programming!
The Winding Number Bug
A friend and I had programmed a multiplayer Bomberman variant on the Amiga. It was mostly done and we were testing it, but we noticed that occasionally the game would crash. Trying to work out why, we systematically tested every feature and nothing crashed. We left the game running for hours and it didn’t crash. We played it for hours… and it repeatedly crashed.
Eventually, I became aware of a pattern to the crashes. If I placed a bomb, then ran around the wall tile up and right of my position, when the bomb exploded the game would crash. So presumably if I just ran up and right one space that would do it? Nope, I had to run around the tile. Up-down-up-down? Nope. What if I went up-down-up-right-right (to make the distance the same)? Nope. How about around the block to the left? Yup, crash!
We looked at each other in confusion. The crash clearly depended on the winding number around the block. That is: it was measuring whether the character’s path wrapped around the wall tile or not. But this was clearly impossible! Nothing in the code had the ability to calculate this, never mind crash after doing so!
Eventually we tracked down what was going on. The game used a kind of primitive homebrew object orientation to handle tile behaviour. The way it worked is that the code number of a tile was used to look up its behaviours. Due to a typo, one of the frames of the bomb flame animation had the same object code as the teleporter tile. The crash was being caused by a character moving onto the space above a bomb explosion flame. This was because the other end of the not-really-a-teleporter couldn’t be found by the game. However, the character had to arrive on exactly the right frame for this to happen, because teleportation was only checked on arrival, not when standing still.
So why the winding number effect? Well, because of the way movement in the game worked, an experienced player could buffer each move a few frames ahead (while the animation for the previous move was completing). So, if you dropped a bomb then walked around the adjacent wall tile you could quite easily do it frame-perfectly… and if you did, you’d arrive on exactly the right frame to trigger the not-a-teleporter bug via the flames from the bomb you’d dropped. Mystery solved!
I just ran into a small gotcha that I think is worth posting. I’ve written a JSON serialization library that can handle all of the primitive types you’d expect in C. I just compiled it on a Raspberry Pi, and one of my unit tests mysteriously failed. The test had to do with the chosen representation for very large integers. Since the JSON spec designates that all number values are stored as double precision floating point, I had some special code that would detect whether a 64-bit integer being serialized would be unrepresentable in double format, and would write it as a string instead of a number in those cases. On the Raspberry Pi, a number that shouldn’t have been representable wasn’t getting stringified.
This made me realize that my implementation depends on CPU behavior. I’m just typecasting from double to uint64_t, then testing equality and using a string representation if the values are unequal after truncation. On all of the other CPUs I’ve compiled this code for, storing UINT64_MAX in a double caused a truncation, but on this CPU it was exactly representable. I’m not sure if this means it’s not IEEE 754 compliant, or if it just interprets it differently. In any case, it’s clear that what I need to do is pick a CPU-independent threshold value and stringify all 64-bit integers above that threshold, instead of relying on typecasting to tell me about representability. Otherwise, a number could hypothetically get encoded one way on one computer, then when decoded on another one that can’t represent it, I’d end up with a different actual value.
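A minimal sketch of what such a CPU-independent threshold check could look like in C. The names are hypothetical, and the choice of 2^53 as the cutoff is an assumption: every integer with magnitude up to 2^53 is exactly representable in an IEEE 754 double, so nothing at or below it can change value in a round trip.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold: 2^53. Some larger integers (such as powers of two)
   are also exactly representable, but this sketch conservatively treats
   everything above the threshold as needing a string. */
#define MAX_EXACT_DOUBLE_INT ((uint64_t)1 << 53)

/* Hypothetical helper: true if an unsigned value should be written as
   a JSON string rather than a JSON number. */
bool uint64_needs_string(uint64_t value) {
    return value > MAX_EXACT_DOUBLE_INT;
}

/* Hypothetical helper: the same check for signed values, applied to
   both ends of the range. */
bool int64_needs_string(int64_t value) {
    return value > (int64_t)MAX_EXACT_DOUBLE_INT ||
           value < -(int64_t)MAX_EXACT_DOUBLE_INT;
}
```

Because the comparison is pure integer arithmetic, the decision no longer depends on how a particular CPU handles an out-of-range double-to-integer conversion.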
Discovering edge cases like this is always fun, as long as they’re fixable. Fortunately, I caught this one before it had the chance to cause mischief in any real-world data.
Years ago (somewhere in the late nineties) I spent several days tracking a bug that appeared at completely random moments in C++ code that had been used for years and was considered stable. In the end it turned out the code was bad from the start – I was deleting an object and then using a pointer from it to do some cleaning. As long as the environment was single-threaded that wasn’t producing problems, as the object – despite being deleted – was still in memory.
Changing the order of the lines was all that was needed to make it work OK.
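The original was C++ and isn't shown, so this is only a sketch of the same ordering rule in C, with hypothetical names: finish any cleanup that goes through the object before releasing the object itself.

```c
#include <stdlib.h>

/* Hypothetical object that owns a separately allocated buffer. */
typedef struct {
    char *buffer;
} Session;

Session *session_create(size_t buffer_size) {
    Session *session = malloc(sizeof *session);
    if (session == NULL) {
        return NULL;
    }
    session->buffer = malloc(buffer_size);
    if (session->buffer == NULL) {
        free(session);
        return NULL;
    }
    return session;
}

void session_destroy(Session *session) {
    if (session == NULL) {
        return;
    }
    /* The buggy order would be free(session) followed by
       free(session->buffer), which reads a pointer out of memory that
       has already been released. */
    free(session->buffer);  /* clean up through the object first */
    free(session);          /* release the object itself last */
}
```

The buggy order often appears to work because the freed memory usually still holds its old contents, which matches the "worked for years" behaviour described above.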
In an application where I can click and drag the mouse to rotate a 3D camera, I was noticing that every so often, starting a drag would cause a sudden jump in the camera’s rotation, much more than it should have been from the amount that I was moving the mouse. Some experimentation revealed that I could make it happen more often if I was already moving the mouse before I pressed the button, though still only around 10% of the time. This was happening on both Windows and Linux.
I had a hunch that this had something to do with setting mouse delta mode from a mouse down event. Since this is for 3D rotation, I want to be able to continue dragging infinitely in one direction without suddenly being unable to move further due to the mouse cursor hitting a screen edge, so when a drag gesture starts, I activate a mode that processes mouse events differently and returns motion deltas without actually moving the cursor. My current implementation of this on the two platforms where the problem occurs is a little bit janky – I hide the mouse cursor, warp the pointer location to the center of the window, then subtract the cursor position from the window center on every move event and warp it to the center again.
One thing this necessitates is ignoring the large delta of motion that’s registered from the warp itself – otherwise, the large jump would happen every time mouse delta mode was activated or deactivated, depending on how close the cursor had been to the window center. My handling for this was to set an explicit screen location to ignore from an upcoming mouse event, so that when the motion event that was caused by warping the cursor came into the event queue, I’d just discard it. This did seem to work most of the time, so why was I still getting large deltas every so often?
I managed to confirm my hunch that the large deltas were coming from motion events that were already in the event queue before the cursor was warped, so when processing events in order, the event at the position I was waiting to discard didn’t come in until after I’d processed one with a different position. Since there’s some variance in timing between when the operating system inserts mouse events into the event queue and when my run loop empties it, the OS would occasionally have already inserted a non-delta-mode motion event into the queue after the mouse down event where I was activating delta mode.
The fix was just a small adjustment to when and how I discard events generated by pointer warping, but it was a bit of an adventure to get there. It also made me realize that there are some other issues with this approach – if I activate delta mode on a window whose center is offscreen, pointer warping doesn’t work at all since I’m trying to warp to an offscreen location. A small window close to the screen edge will get a reduced range of motion in the direction of the edge, since the invisible cursor will hit it and be stopped. I should probably find a more direct way to measure mouse deltas than with pointer warping, but maybe I could do a band-aid fix by warping to the center of the screen instead of the center of the window…
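The exact adjustment isn't spelled out above, so the following is only one possible way to handle it, sketched in C with hypothetical names and no real windowing API: after requesting a warp, drop every motion event until the one at the warp target shows up, so stale events that were queued before the warp never produce a delta. For simplicity it measures deltas from the previous position rather than re-warping to the center on every move.

```c
#include <stdbool.h>

/* Hypothetical state for delta-mode mouse tracking. */
typedef struct {
    bool awaiting_warp;  /* true from the warp request until its event arrives */
    int warp_x, warp_y;  /* location the pointer was warped to */
    int last_x, last_y;  /* reference position for the next delta */
} DeltaTracker;

/* Call right after asking the OS to warp the pointer to (x, y). */
void delta_tracker_begin_warp(DeltaTracker *tracker, int x, int y) {
    tracker->awaiting_warp = true;
    tracker->warp_x = x;
    tracker->warp_y = y;
}

/* Feed every motion event through this. Returns true and fills in
   dx/dy when the event should be reported as a delta. */
bool delta_tracker_motion(DeltaTracker *tracker, int x, int y,
                          int *dx, int *dy) {
    if (tracker->awaiting_warp) {
        if (x == tracker->warp_x && y == tracker->warp_y) {
            /* This is the event generated by the warp itself. */
            tracker->awaiting_warp = false;
            tracker->last_x = x;
            tracker->last_y = y;
        }
        /* Stale pre-warp events and the warp event are both dropped. */
        return false;
    }
    *dx = x - tracker->last_x;
    *dy = y - tracker->last_y;
    tracker->last_x = x;
    tracker->last_y = y;
    return true;
}
```

One caveat with this sketch: a genuine motion event that happens to land exactly on the warp target would be mistaken for the warp event, so a real implementation might want a timestamp or serial number from the platform if one is available.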