Post new topic Reply to topic  [ 29 posts ]  Go to page 1, 2  Next
Author Message
User avatar

Joined: Fri Sep 27, 2013 11:37 am
Posts: 400
Post Weekly Dev Meeting - 6th April
Discussion topic for post: http://www.starsonata.com/blog/weekly-d ... 6th-april/


Wed Apr 06, 2016 5:51 pm
Profile
User avatar
Team: Suns of Hades
Rank: Soldier
Main: LemonPrime
Level: 8087

Joined: Wed Sep 29, 2010 10:14 pm
Posts: 5747
Post Re: Weekly Dev Meeting - 6th April
Quote:
We’ve picked a date for our universe reset and you’re getting lots of forewarning on it! The chosen date for the reset is the 7th May. A proper announcement will come nearer the time.


ayyy

_________________
Lemon/Meo


Wed Apr 06, 2016 5:55 pm
Profile
User avatar
Team: Cybernetic Trading Co.
Rank: Councilor
Main: 1-800-USE_THE_FORCE!
Level: 9597

Joined: Sun Aug 28, 2005 6:36 pm
Posts: 2769
Post Re: Weekly Dev Meeting - 6th April
Quote:
No Patch :(

_________________
"I still miss the Crack Whores..." - Jeff_L


Wed Apr 06, 2016 6:02 pm
Profile
User avatar
Team: Eminence Front
Rank: Officer
Main: Chrono Warrior
Level: 5817

Joined: Sat Feb 16, 2013 1:36 pm
Posts: 328
Post Re: Weekly Dev Meeting - 6th April
redalert150 wrote:
Quote:
No Patch :(


- Chrono

_________________
anilv wrote:
#feelthethrm


Wed Apr 06, 2016 6:03 pm
Profile
User avatar
Team: Suns of Hades
Rank: Soldier
Main: LemonPrime
Level: 8087

Joined: Wed Sep 29, 2010 10:14 pm
Posts: 5747
Post Re: Weekly Dev Meeting - 6th April
Chrono Warrior wrote:
redalert150 wrote:
Quote:
No Patch :(


_________________
Lemon/Meo


Wed Apr 06, 2016 6:05 pm
Profile
Main: ShawnMcCall
Level: 2589

Joined: Sat Jan 19, 2013 5:42 am
Posts: 1932
Post Re: Weekly Dev Meeting - 6th April
Chrono Warrior wrote:
redalert150 wrote:
Quote:
No Patch :(


- Chrono


- Shawn


Wed Apr 06, 2016 9:53 pm
Profile
Member
User avatar
Team: Star Revolution X
Rank: Officer
Main: topbuzzz
Level: 8015

Joined: Sun Dec 21, 2008 12:31 pm
Posts: 4347
Post Re: Weekly Dev Meeting - 6th April
Drone looks sexy


Thu Apr 07, 2016 12:36 am
Profile WWW
User avatar
Main: GodSteel
Level: 3170

Joined: Mon Oct 18, 2010 9:23 am
Posts: 272
Post Re: Weekly Dev Meeting - 6th April
What about the default camera angle? IIRC someone was going to raise this topic at this dev meeting.


Thu Apr 07, 2016 1:56 am
Profile
Content Dev
User avatar
Main: Auxilium
Level: 1

Joined: Thu Mar 24, 2016 10:36 am
Posts: 40
Location: Test Server
Post Re: Weekly Dev Meeting - 6th April
Pretty sure Jey said he's planning to patch on Friday evening / Saturday morning and make a blog post about it before then, but don't quote me on that.


Thu Apr 07, 2016 2:48 am
Profile
Team: Deep Space Federation
Rank: Operator
Main: Brugle
Level: 4111

Joined: Sun Feb 17, 2013 5:24 am
Posts: 362
Location: Norway
Post Re: Weekly Dev Meeting - 6th April
Out of curiosity, what is a memory corruption bug, and what did the one Jey found look like?

I have always thought of it as something like this: a variable is given a certain memory size and the program violates that size?

For example, a boolean is given the string 'true' instead of the constant true, thus assigning more bytes than allowed to the variable?


Thu Apr 07, 2016 3:43 am
Profile
User avatar
Main: GodSteel
Level: 3170

Joined: Mon Oct 18, 2010 9:23 am
Posts: 272
Post Re: Weekly Dev Meeting - 6th April
andsimo wrote:
Out of curiosity, what is a memory corruption bug, and what did the one Jey found look like?

I have always thought of it as something like this: a variable is given a certain memory size and the program violates that size?

For example, a boolean is given the string 'true' instead of the constant true, thus assigning more bytes than allowed to the variable?

The compiler wouldn't allow that. The easiest example would be an array of size 5 where the code accesses the 6th item or beyond.

Out of curiosity, in a world where most of human knowledge is available to anyone with access to the internet, why did you choose to learn from the forum of a niche game with about 50 active forum users, rather than searching for the phrase "memory corruption" and checking online encyclopedias?


Thu Apr 07, 2016 4:03 am
Profile
Member
User avatar
Team: Star Revolution X
Rank: Officer
Main: topbuzzz
Level: 8015

Joined: Sun Dec 21, 2008 12:31 pm
Posts: 4347
Post Re: Weekly Dev Meeting - 6th April
cos this forum has Kane


Thu Apr 07, 2016 4:51 am
Profile WWW
Team: Deep Space Federation
Rank: Operator
Main: Brugle
Level: 4111

Joined: Sun Feb 17, 2013 5:24 am
Posts: 362
Location: Norway
Post Re: Weekly Dev Meeting - 6th April
Godsteel wrote:
Out of curiosity, in a world where most of human knowledge is available to anyone with access to internet, ....
How could I then hope to see the memory corruption Jey found?


Thu Apr 07, 2016 5:02 am
Profile
Dev Team
User avatar
Team: Eminence Front
Rank: Officer
Main: Jey123456
Level: 4359

Joined: Fri Sep 24, 2004 11:51 pm
Posts: 3366
Location: who knows ?
Post Re: Weekly Dev Meeting - 6th April
andsimo wrote:
Out of curiosity, what is a memory corruption bug, and what did the one Jey found look like?

I have always thought of it as something like this: a variable is given a certain memory size and the program violates that size?

For example, a boolean is given the string 'true' instead of the constant true, thus assigning more bytes than allowed to the variable?


If it was a simple array going out of bounds, it would have been much, much easier to find and fix XD (you can use a technique generally called memory fencing, where you add data after each memory allocation and then validate that data to find bounds issues).
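To illustrate the memory fencing idea mentioned above, here is a minimal C++ sketch, not the actual server code. The names guarded_alloc, guard_intact, guarded_free and the guard value kGuard are made up for this example, and it only guards the end of the block; real tools usually guard the front as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical guard pattern written right after every allocation.
constexpr std::uint32_t kGuard = 0xDEADBEEF;

// Allocate `size` usable bytes followed by one guard word.
void* guarded_alloc(std::size_t size) {
    auto* raw = static_cast<unsigned char*>(std::malloc(size + sizeof(kGuard)));
    if (raw == nullptr) {
        return nullptr;
    }
    std::memcpy(raw + size, &kGuard, sizeof(kGuard));
    return raw;
}

// Returns true if the guard word after the block still holds kGuard.
bool guard_intact(const void* block, std::size_t size) {
    std::uint32_t found = 0;
    std::memcpy(&found, static_cast<const unsigned char*>(block) + size,
                sizeof(found));
    return found == kGuard;
}

// Validate the guard before releasing the block (checked in debug builds).
void guarded_free(void* block, std::size_t size) {
    assert(guard_intact(block, size) && "guard word overwritten");
    std::free(block);
}
```

With a 5-byte block, a write to the 6th byte lands on the guard word, so guard_intact returns false afterwards and guarded_free would trip its assert in a debug build.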

This specific case was much more obscure, and I'll be the first to admit I do not understand 100% of what is going on in it. But after lots of digging in the memory and the gcc docs, I found that it was an issue with a very, very rare memory race condition, often referred to as an L2 cache coherency problem. It's a problem that can occur on a multiprocessor system (not multicore, since a multicore processor shares its L2 cache between all cores, but multiple processors, where you have separate / independent L2 caches).

For a reason that I'm not 100% sure of, but which likely has to do with the inline call combined with some pretty repetitive sequential memory access inside our spaceobjects container (sobmap), the data ended up being kept in L2 cache longer than it should be, and the memory writes were done in a different order than written in the code. When a write was done on a CPU, even though it was the same thread, if the thread was moved to another processor (not core, but processor), the sobmap rebalanced, and the thread then moved back to the original processor, it did not always invalidate the L2 cache in time and ended up reading the memory from before the sobmap rebalance.

When I started suspecting the issue, to help with fixing it / gathering data faster, I set up a virtual machine with 16 processors, each having 2 cores (so that I ended up with 16x emulated L2 cache, albeit small), and modified the server code to constantly change sobmaps in every galaxy every frame (forcing constant sobmap rebalances). This in turn caused the error to happen within 24 hours of the test launch and gave me a huge amount of memory to comb through to figure out exactly what was happening.

Once I finally put my finger on the pattern at fault here (the 4 bytes just before the "current" node in the map were always pointing toward a node that was before the current node, even though that pointer should have been the "next" node), I was able to write a script to compare the bug with the 150ish old unresolved server crash dumps I have on disk and confirm that the same pattern was found in 91 of those 150 dumps. That was a great sign: it meant that I had finally found a way to reproduce one of the major remaining "odd" crashes, that is, the crashes where the crash data doesn't make any sense / tell you anything as to what caused it.

After that, it was merely a matter of adding a bit of logging to track thread context switches, to be sure it was indeed a multiprocessor issue, alongside a validation on the data so that it would crash right away when said memory mismatch happened. Sure enough, it never happened when the thread was not switched to another context, nor did it ever happen when it was switched to another core on the same processor, but it did happen rarely on processor context changes (less than 1 in 10,000 times).
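The "crash right away" validation described above could look roughly like the following C++ sketch. The real sobmap node layout is not shown in this thread, so Node, next_link_valid and validate_or_abort are hypothetical, and the rule that "next" must lead to a larger key is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical, simplified node; not the real sobmap layout.
struct Node {
    int key;
    Node* next;  // assumed to always lead to a node with a larger key
};

// True if `next` is null or leads forward in key order.
bool next_link_valid(const Node& n) {
    return n.next == nullptr || n.next->key > n.key;
}

// Fail fast: abort at the moment the mismatch is seen, so the crash dump
// points at the corruption instead of at some later, unrelated access.
void validate_or_abort(const Node* n) {
    assert(n != nullptr);  // a null argument is a caller bug, not corruption
    if (!next_link_valid(*n)) {
        std::abort();
    }
}
```

Calling validate_or_abort on each node touched around a rebalance turns a silent stale pointer into an immediate, easy-to-read crash.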

Once I had confirmed that the issue was what I suspected, I merely added some memory barriers to our sobmap rebalance that enforce a flush of all invalidated L2 cache (and wait for it to complete) before and after a rebalance. This added a few nanoseconds to the sobmap tree rebalance but made it multiprocessor safe at the same time. Another option would have been to set its content to volatile, but that would have had a much more serious performance impact, since it would then never read from cache and always re-query from RAM (which is slow as hell in comparison).
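The post does not say which barrier was used, so the following is only a sketch, assuming a C++11 compiler and std::atomic_thread_fence from <atomic>. SobMap and rebalance are hypothetical stand-ins (the "rebalance" here is just a sort); the only point is the placement of a full fence before and after the restructuring.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the real container.
struct SobMap {
    std::vector<int> keys;
};

// Stand-in "rebalance": reorders the keys, fenced on both sides.
void rebalance(SobMap& map) {
    // Full fence: memory operations issued before this point are not
    // reordered past it by the compiler or the CPU.
    std::atomic_thread_fence(std::memory_order_seq_cst);

    std::sort(map.keys.begin(), map.keys.end());

    // Second full fence: the restructured data is ordered before any
    // memory operation that follows the rebalance.
    std::atomic_thread_fence(std::memory_order_seq_cst);

    assert(std::is_sorted(map.keys.begin(), map.keys.end()));
}
```

On GCC, the older builtin __sync_synchronize() also issues a full barrier and could be placed at the same two points.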

This is obviously just an overview of the problem / process to fix it. I skipped a lot of details to try to keep it simple to understand (as simple as that kind of problem can be to understand).

_________________
One of the first and proud flight controller.
Visit our website: http://www.ef-team.com


Thu Apr 07, 2016 6:53 am
Profile
Contributor
User avatar
Team: Eminence Front
Rank:
Main: Dark Steel
Level: 9138

Joined: Tue Jan 24, 2006 10:35 am
Posts: 2068
Location: Netherlands
Post Re: Weekly Dev Meeting - 6th April
Everytime Jey posts one of these I'm reminded why he's a god among men

_________________
~DarkSteel / Auxilium
Image
Image

Universe Map: http://www.starsonata.com/map/


Thu Apr 07, 2016 7:14 am
Profile