We all know that IN-Memory is a great new feature, but how much work is done on behalf to maintain the in-memory column store ? You may be surprised !
In this post i will trace the work done by the In-memory background worker processes W0nn during population and trickle re-population .I will be using systemtap and some scripts developed by Luca Canali.So let’s begin !
Test database 126.96.36.199.6 (Linux)
The script used are a mix of :
I will use the sample created in my last blog post .
Test of In-memory population :
The table ‘TEST_COST1’ is not already loaded in-memory.Let’s request it’s population and take a look at the different systemtap trace file:
select count(*) from test_cost1;
So now the table is loaded in two different IMCUs.Let’s now check the collected systemtap trace file (Contain wait event begin/end ,async I/O syscall and trace of consistent read kcbgtcr ) :
As Frits Hoogland stated in his blog post the table is read using Direct path read (we can also observe that it begin by using only one I/O slot at a time and not the default which is 2 for adaptive direct path reads but this is for another blog post for more info on auto I/O slot resize please check this blog post)
Let’s do some mining on the collected trace file and observe :
There is 31975 consistent block reads from TEST_COST1 with contain 12241 blocks !
We have read 254MB for loading a table of approximately 96MB;
The in-memory store contain 7475+4662 = 12137 blocks
12137+ 1 header block (which is not loaded in the column store with the space management block ) = 12138 blocks so all the segment is read but how many times ?
Wow ! for an unknown reason the segments is read 2 times into the PGA using direct path reads and some parts of the segment are read 3 times (the segment header is read multiple times) ! so this induce more physical reads,repeated block clean out etc !
So watch out as you may see a huge increase of the load on the server during in-memory population !
That’s not all it’s now time for :
Test of In-memory re-population :
Let’s begin by checking how the IMCUs are mapped to the datafile extents.We can do that using Franck pachot script
I will use this output to check in which IMCUs each block/extents is populated.
Let’s now update a row in the second IMCU and see what happen :
After some times trickle repopulation kicks in :
So what happened ?
Part of the segment is read using direct path read mode but how much data ?
Wow ! 9435 consistent reads and 74MB reads caused by one stale rows ! The segment header was read 111 times from the buffer cache ((9435 – 111)*8KB = 74592KB correspond to the volume of data read ) and all the extents mapped to the second IMCU are read two times using direct path read.We can verify that using the mapping of the extents to the IMCU and also we have read 4663 different block and the second IMCU contain 4662 blocks (minus the segment header). This seems a lot of work for cleaning stale entry (hopefully this does not happen when a session is accessing a stale row from the in-memory column store more info here)
Let’s now update two rows in two different blocks in two different IMCU and see what happen :
After some times trickle repopulation kicks in. How much data ?
The two IMCU are processed by two different in-memory background process (W0nn) .Each background process have read all the extent mapped to it’s corresponding IMCU two times (reading the segment header multiple times). Woow ! this seems a lot of work for two staled rows ! The full segment was read two times for cleaning two staled rows !
Why not only reading the blocks containing the stale rows using the rowid ? Why reading two times all the extents that are mapped to the IMCU which contain the stale rows ? einh ?
That’s it 😀