When I first heard the word conflation, my first thought was, errrm, is that word made up? But apparently it's legit, and in the context I heard it, the term related to maintaining an up-to-date and coherent view of data for financial securities (such as stocks or indexes).
The following is a short and direct definition from
Wiktionary:
"A blend or fusion, esp. a composite reading or text formed by combining the material of two or more texts into a single text."
One way to think about conflation is, given a continuous feed of pricing information from an upstream source, it maintains up-to-date and consistent stock price data. This includes numbers such as: bid, ask, last (traded) and (traded) volume.
| VERSION | BID | ASK | LAST | VOLUME |
|---|---|---|---|---|
| 1 | 20 | 25 | 24.5 | 100000 |
| 2 | 22.5 | 25 | 24.5 | 100000 |
| 3 | 22.5 | 26 | 25 | 200000 |
| 4 | 24 | 26 | 25 | 200000 |
In the above table the bold numbers indicate an updated value, and the green rows indicate a single get from the client application. Conflating fuses all the previous updates into a single consistent set for each requested version, while discarding the historic versions. This kind of thing is also called baselining, or snapshotting.
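As a rough illustration of the idea, here is a minimal single-threaded sketch that folds partial updates into the latest snapshot and replays the four versions from the table. The `Snapshot` and `Conflater` names, the use of `null` to mean "field unchanged", and the `"XYZ"` symbol are all my own assumptions for the example, not part of the perf app.

```java
import java.util.HashMap;
import java.util.Map;

/** Immutable snapshot of one security (field names follow the table above). */
final class Snapshot {
    final long version;
    final double bid;
    final double ask;
    final double last;
    final long volume;

    Snapshot(long version, double bid, double ask, double last, long volume) {
        this.version = version;
        this.bid = bid;
        this.ask = ask;
        this.last = last;
        this.volume = volume;
    }
}

/** Hypothetical conflater: keeps only the latest merged snapshot per symbol. */
final class Conflater {
    private final Map<String, Snapshot> latest = new HashMap<>();

    /** Applies a partial update; a null argument means "this field did not change". */
    Snapshot update(String symbol, Double bid, Double ask, Double last, Long volume) {
        Snapshot prev = latest.get(symbol);
        Snapshot next;
        if (prev == null) {
            // First update for this symbol: missing fields default to zero.
            next = new Snapshot(1,
                    bid != null ? bid : 0.0,
                    ask != null ? ask : 0.0,
                    last != null ? last : 0.0,
                    volume != null ? volume : 0L);
        } else {
            // Merge the changed fields over the previous snapshot and bump the version.
            next = new Snapshot(prev.version + 1,
                    bid != null ? bid : prev.bid,
                    ask != null ? ask : prev.ask,
                    last != null ? last : prev.last,
                    volume != null ? volume : prev.volume);
        }
        latest.put(symbol, next); // the previous version is discarded
        return next;
    }

    /** Returns the current consistent snapshot, or null if the symbol is unknown. */
    Snapshot get(String symbol) {
        return latest.get(symbol);
    }
}

final class ConflationDemo {
    public static void main(String[] args) {
        Conflater c = new Conflater();
        c.update("XYZ", 20.0, 25.0, 24.5, 100000L); // version 1
        c.update("XYZ", 22.5, null, null, null);    // version 2: bid only
        c.update("XYZ", null, 26.0, 25.0, 200000L); // version 3: ask, last, volume
        c.update("XYZ", 24.0, null, null, null);    // version 4: bid only
        Snapshot s = c.get("XYZ");
        System.out.println(s.version + " " + s.bid + " " + s.ask + " " + s.last + " " + s.volume);
    }
}
```

A client that only asks after the fourth update sees one consistent row and never the intermediate versions.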
Conflation has a number of performance considerations:
- How to model updatable price information?
- How to efficiently store all pricing information?
- How to manage concurrent reads and writes of the pricing store?
To help answer the above questions I've produced a simple test application. If you're really keen you can check out the source from SVN here, or browse it here.
The perf app has the following objects:
- Price Object: This needs to store the price information, and should be immutable when accessed by the client app.
- Price Store: Maintains an internal collection of the Price objects, and supports updates and gets.
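To make the two objects concrete, here is a minimal sketch of what they might look like. The field set is borrowed from the table earlier in the post; the exact class shapes, the `long` id key and the method names are assumptions for illustration rather than the code in the SVN repository.

```java
import java.util.HashMap;
import java.util.Map;

/** Immutable price: safe to hand to client code because it can never change. */
final class Price {
    private final long id;
    private final double bid;
    private final double ask;
    private final double last;
    private final long volume;

    Price(long id, double bid, double ask, double last, long volume) {
        this.id = id;
        this.bid = bid;
        this.ask = ask;
        this.last = last;
        this.volume = volume;
    }

    long id() { return id; }
    double bid() { return bid; }
    double ask() { return ask; }
    double last() { return last; }
    long volume() { return volume; }
}

/** The two operations the store has to support. */
interface PriceStore {
    void update(Price price);
    Price get(long id);
}

/** Unsynchronized HashMap-backed store: only safe when used from a single thread. */
final class PlainPriceStore implements PriceStore {
    private final Map<Long, Price> prices = new HashMap<>();

    @Override
    public void update(Price price) {
        prices.put(price.id(), price); // replaces any previous price for this id
    }

    @Override
    public Price get(long id) {
        return prices.get(id); // null if the id was never stored
    }
}
```

Because `Price` has only final fields and no setters, the store can return its internal reference directly without risking a client modifying it.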
To test a wide range of scenarios the following different combinations were implemented and tested:
- PriceStore internal collection
  - Plain Java HashMap
  - Trove Primitive (Long) Map
- PriceStore concurrency
  - Unsynchronized Get/Update (Used only as a control test, not a valid scenario)
  - Synchronized Get/Update
  - Read-Write-Lock Get/Update
- PriceStore Price object management
  - Immutable price object store with New-On-Update.
  - Mutable price store with Copy-On-Read (get).
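The difference between the concurrency and object-management options is easier to see side by side. The sketch below pairs synchronized methods with New-On-Update, and a read/write lock with Copy-On-Read; the class names and the cut-down two-field `Quote` are hypothetical, and the Trove-backed variants are left out so the example needs nothing beyond the JDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Immutable value handed to clients in both variants. */
final class Quote {
    final long id;
    final double bid;
    final double ask;

    Quote(long id, double bid, double ask) {
        this.id = id;
        this.bid = bid;
        this.ask = ask;
    }
}

/** Variant A: synchronized methods, New-On-Update (the store holds immutable objects). */
final class SyncNewOnUpdateStore {
    private final Map<Long, Quote> quotes = new HashMap<>();

    synchronized void update(long id, double bid, double ask) {
        quotes.put(id, new Quote(id, bid, ask)); // allocates on every write
    }

    synchronized Quote get(long id) {
        return quotes.get(id); // no copy needed, the object is immutable
    }
}

/** Variant B: read/write lock, Copy-On-Read (the store holds mutable cells). */
final class RwLockCopyOnReadStore {
    private static final class Cell {
        double bid;
        double ask;
    }

    private final Map<Long, Cell> cells = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void update(long id, double bid, double ask) {
        lock.writeLock().lock(); // exclusive: blocks readers and other writers
        try {
            Cell cell = cells.get(id);
            if (cell == null) {
                cell = new Cell();
                cells.put(id, cell);
            }
            cell.bid = bid; // mutate in place, no allocation for existing ids
            cell.ask = ask;
        } finally {
            lock.writeLock().unlock();
        }
    }

    Quote get(long id) {
        lock.readLock().lock(); // shared: many readers may hold it at once
        try {
            Cell cell = cells.get(id);
            return cell == null ? null : new Quote(id, cell.bid, cell.ask); // allocates on every read
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The trade-off is where the allocation happens: New-On-Update pays on each write, Copy-On-Read pays on each get. The read/write lock lets gets proceed in parallel, whereas synchronized methods serialise every call.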
The steps implemented in the performance app:
- Pre-populate the PriceStore with all Prices.
  - Out in the wild this type of application is always on, and always full.
  - Not included in timings.
- Create Writer threads.
  - Write all prices to Price Store in bursts with small lag.
  - Timed
- Create Reader threads.
  - Read all prices from Price Store in bursts with small lag.
  - Timed
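The steps above can be sketched roughly as follows. This is not the perf app itself: `BurstHarness`, its method names and the `ConcurrentHashMap` stand-in store are placeholders, the sizes are scaled down so it finishes quickly, and running the writers and readers at the same time under a single timer is my assumption about how the phases overlap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongConsumer;

final class BurstHarness {
    /**
     * Builds (but does not start) worker threads. Each worker makes `loops` passes
     * over ids [0, count), split into `bursts` bursts with a pause after each burst.
     * Assumes count is divisible by bursts.
     */
    static List<Thread> workers(int threads, int loops, int count, int bursts,
                                long lagMillis, LongConsumer op) {
        final int burstSize = count / bursts;
        List<Thread> result = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            result.add(new Thread(() -> {
                try {
                    for (int loop = 0; loop < loops; loop++) {
                        for (int b = 0; b < bursts; b++) {
                            long from = (long) b * burstSize;
                            long to = from + burstSize;
                            for (long id = from; id < to; id++) {
                                op.accept(id); // one update or one get
                            }
                            Thread.sleep(lagMillis); // lag between bursts
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        return result;
    }

    /** Starts every thread, waits for all of them, and returns the elapsed milliseconds. */
    static long runAll(List<Thread> all) {
        long start = System.nanoTime();
        for (Thread t : all) {
            t.start();
        }
        try {
            for (Thread t : all) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int count = 1_000; // scaled down for the example
        Map<Long, Double> store = new ConcurrentHashMap<>(); // stand-in for the PriceStore

        // Pre-populate outside the timed section.
        for (long id = 0; id < count; id++) {
            store.put(id, 0.0);
        }

        List<Thread> all = new ArrayList<>();
        all.addAll(workers(2, 3, count, 4, 10, id -> store.put(id, (double) id))); // writers
        all.addAll(workers(4, 5, count, 4, 5, id -> store.get(id)));               // readers
        System.out.println("elapsed ms: " + runAll(all));
    }
}
```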
The following parameters were used to set up the app runs:

| Parameter | Value | Description |
|---|---|---|
| Price Count | 1,000,000 | Number of distinct prices stored. |
| Burst Count (Burst Size) | 4 (250,000) | Used to simulate bursts of price updates/gets. |
| Writer Count | 10 | Number of writer threads. |
| Write Burst Lag | 10 millis | The delay between each update burst. |
| Per Writer Loops | 3 | Number of times each writer updates all prices. |
| Reader Count | 100 | Number of reader threads. |
| Read Burst Lag | 5 millis | The delay between each get burst. |
| Per Reader Loops | 5 | Number of times all readers get all prices. |
The results were produced on my MacBook Pro Unibody 15", with an Intel Core 2 Duo 2.8GHz, with 4GB RAM, running OS X 10.5.7. The Java VM used was the latest Apple JDK 1.6.0_13 64-bit. The perf app was run with the following JVM args: -Xms512M -Xmx512M. Note also the Apple JDK6 only supports the -server JVM, and no further GC tweaks were made.
The test application was run in two passes: the ORDERED run, where all prices are accessed in sequential order; and the SHUFFLED run, where all prices are accessed in random order.
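One simple way to produce the two access patterns is to build the list of price ids once and shuffle a copy of it. How the perf app actually generates its random order isn't described here, so treat the `AccessOrder` helper below, including the fixed seed for repeatable runs, as an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class AccessOrder {
    /** Ids 0..count-1 in sequential order (the ORDERED run). */
    static List<Long> ordered(int count) {
        List<Long> ids = new ArrayList<>(count);
        for (long id = 0; id < count; id++) {
            ids.add(id);
        }
        return ids;
    }

    /** The same ids in a random permutation (the SHUFFLED run). */
    static List<Long> shuffled(int count, long seed) {
        List<Long> ids = ordered(count);
        Collections.shuffle(ids, new Random(seed)); // seeded so a run can be repeated
        return ids;
    }
}
```

Both lists contain every id exactly once, so the two runs do the same amount of work and differ only in memory access pattern.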
This chart shows the CPU and Heap usage for the consecutive runs of the different configurations, for the ORDERED access of price objects.
| Name | Duration (Seconds) | Description |
|---|---|---|
| PLAIN | 47.13 | Unsynchronized, HashMap, New-On-Update, Control Test Only |
| SYNC_MAP | 152.71 | Synchronized, HashMap, New-On-Update |
| RWLOCK_MAP | 74.60 | Read/Write Lock, HashMap, New-On-Update |
| SYNC_MAP_ALT | 149.24 | Synchronized, HashMap, Copy-On-Read |
| RWLOCK_MAP_ALT | 64.36 | Read/Write Lock, HashMap, Copy-On-Read |
| SYNC_LONGMAP | 157.25 | Synchronized, Trove Long Map, New-On-Update |
| RWLOCK_LONGMAP | 87.20 | Read/Write Lock, Trove Long Map, New-On-Update |
| SYNC_LONGMAP_ALT | 179.10 | Synchronized, Trove Long Map, Copy-On-Read |
| RWLOCK_LONGMAP_ALT | 91.83 | Read/Write Lock, Trove Long Map, Copy-On-Read |
This chart shows the CPU and Heap usage for the consecutive runs of the different configurations, for the SHUFFLED access of price objects.
| Name | Duration (Seconds) | Description |
|---|---|---|
| PLAIN | 138.95 | Unsynchronized, HashMap, New-On-Update, Control Test Only |
| SYNC_MAP | 489.97 | Synchronized, HashMap, New-On-Update |
| RWLOCK_MAP | 174.79 | Read/Write Lock, HashMap, New-On-Update |
| SYNC_MAP_ALT | 441.71 | Synchronized, HashMap, Copy-On-Read |
| RWLOCK_MAP_ALT | 159.02 | Read/Write Lock, HashMap, Copy-On-Read |
| SYNC_LONGMAP | 327.84 | Synchronized, Trove Long Map, New-On-Update |
| RWLOCK_LONGMAP | 122.76 | Read/Write Lock, Trove Long Map, New-On-Update |
| SYNC_LONGMAP_ALT | 333.35 | Synchronized, Trove Long Map, Copy-On-Read |
| RWLOCK_LONGMAP_ALT | 126.38 | Read/Write Lock, Trove Long Map, Copy-On-Read |
From the results above, and leaving aside the PLAIN control, we can see that the RWLOCK_MAP_ALT configuration gives the best results for the ORDERED access of elements (64.36 seconds). For SHUFFLED access the two Trove Read/Write Lock configs lead, with RWLOCK_LONGMAP at 122.76 seconds and RWLOCK_LONGMAP_ALT close behind at 126.38 seconds.
The clear winner is the Read/Write Lock in favour of synchronized methods: it is roughly 1.8 to 2.8 times faster in every configuration. The Copy-On-Read pattern beats the New-On-Update immutable object with the HashMap, but with the Trove Long Map New-On-Update is slightly faster in both runs. I expected the Trove Long Map to perform best in both cases, but given the ORDERED access is only an academic exercise, the fact that it performs best on SHUFFLED (most likely real-world usage) is a good thing.
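The gap between synchronized methods and the Read/Write Lock can be put into numbers straight from the two result tables. The small `Speedup` helper below is just illustrative arithmetic on the published durations, not part of the perf app.

```java
final class Speedup {
    /** How many times faster the second duration is than the first (both in seconds). */
    static double ratio(double slowSeconds, double fastSeconds) {
        return slowSeconds / fastSeconds;
    }

    public static void main(String[] args) {
        // Durations copied from the ORDERED table: SYNC_* versus the matching RWLOCK_*.
        System.out.printf("ORDERED  MAP:         %.2fx%n", ratio(152.71, 74.60));
        System.out.printf("ORDERED  MAP_ALT:     %.2fx%n", ratio(149.24, 64.36));
        System.out.printf("ORDERED  LONGMAP:     %.2fx%n", ratio(157.25, 87.20));
        System.out.printf("ORDERED  LONGMAP_ALT: %.2fx%n", ratio(179.10, 91.83));

        // Durations copied from the SHUFFLED table.
        System.out.printf("SHUFFLED MAP:         %.2fx%n", ratio(489.97, 174.79));
        System.out.printf("SHUFFLED MAP_ALT:     %.2fx%n", ratio(441.71, 159.02));
        System.out.printf("SHUFFLED LONGMAP:     %.2fx%n", ratio(327.84, 122.76));
        System.out.printf("SHUFFLED LONGMAP_ALT: %.2fx%n", ratio(333.35, 126.38));
    }
}
```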
There are some curious patterns in the Heap Usage for both runs. In particular I expected the SYNC_MAP and RWLOCK_MAP heaps to have the same saw-tooth pattern, but RWLOCK_MAP back-loads the Heap/GC activity. Usage of the Trove Long Map offers not only better random access time, but also lower heap utilisation across the board.
Future investigation is to run the test on JDK 1.6.0_14 with the new G1 GC, to see what impact this has, as well as to try other GC configurations, concurrency patterns, price object designs and distributed data access (Oracle Coherence/Terracotta/Other...).
I hope you found this analysis interesting.