<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Top Experts &#187; Blog</title>
	<atom:link href="http://topexperts.co.il/category/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://topexperts.co.il</link>
	<description>Your business our experts</description>
	<lastBuildDate>Thu, 02 Jun 2016 11:19:00 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.1.32</generator>
	<item>
		<title>ORATOP &#8211; Do Less&#8230; Get More</title>
		<link>http://topexperts.co.il/oratopdo-less-get-more/</link>
		<comments>http://topexperts.co.il/oratopdo-less-get-more/#comments</comments>
		<pubDate>Thu, 02 Jun 2016 11:19:00 +0000</pubDate>
		<dc:creator><![CDATA[Gadi Chen]]></dc:creator>
				<category><![CDATA[Blog]]></category>

		<guid isPermaLink="false">http://topexperts.co.il/?p=338</guid>
		<description><![CDATA[oratop is a free Oracle tool for real-time database monitoring, real-time database performance analysis, and identifying contention and bottlenecks. You are welcome to download and enjoy my presentation from Iloug Tech Days3 &#8211; 2016]]></description>
				<content:encoded><![CDATA[<p>oratop is a free Oracle tool for:</p>
<ul>
<li>Real-time database monitoring</li>
<li>Real-time database performance analysis</li>
<li>Identifying contention and bottlenecks</li>
</ul>
<p>You are welcome to download and enjoy <a title="http://topexperts.co.il/GadiChen-TechDays2016.pdf" href="http://topexperts.co.il/GadiChen-TechDays2016.pdf">My Presentation</a> from <a href="https://iloug-2016.events.co.il/home">Iloug Tech Days3 &#8211; 2016</a></p>
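<p>For context, oratop is launched from the command line against a running instance; the flags below are an assumption based on common usage, so check <code>oratop -h</code> on your system before relying on them:</p>
<pre><code># connect as sysdba and refresh every 10 seconds (interval flag assumed)
$ oratop -i 10 / as sysdba</code></pre>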
]]></content:encoded>
			<wfw:commentRss>http://topexperts.co.il/oratopdo-less-get-more/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MySQL Supported Storage Engines</title>
		<link>http://topexperts.co.il/mysql-supported-storage-engines/</link>
		<comments>http://topexperts.co.il/mysql-supported-storage-engines/#comments</comments>
		<pubDate>Thu, 28 Jan 2016 14:03:32 +0000</pubDate>
		<dc:creator><![CDATA[Gadi Chen]]></dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[MySql]]></category>

		<guid isPermaLink="false">http://topexperts.co.il/?p=335</guid>
		<description><![CDATA[InnoDB: The default storage engine as of MySQL 5.5.5. InnoDB is a transaction-safe (ACID compliant) storage engine for MySQL that has commit, rollback, and crash-recovery capabilities to protect user data. InnoDB row-level locking (without escalation to coarser granularity locks) and Oracle-style consistent nonlocking reads increase multi-user concurrency and performance. InnoDB stores user data in&#8230;<p><a class="more-link" href="http://topexperts.co.il/mysql-supported-storage-engines/" title="Continue reading &#8216;MySQL Supported Storage Engines&#8217;">Continue reading <span class="meta-nav">&#8594;</span></a></p>]]></description>
				<content:encoded><![CDATA[<ul>
<li><a href="https://dev.mysql.com/doc/refman/5.5/en/innodb-storage-engine.html"><code>InnoDB</code></a>: The default storage engine as of MySQL 5.5.5. <code>InnoDB</code> is a transaction-safe (ACID compliant) storage engine for MySQL that has commit, rollback, and crash-recovery capabilities to protect user data. <code>InnoDB</code> row-level locking (without escalation to coarser granularity locks) and Oracle-style consistent nonlocking reads increase multi-user concurrency and performance. <code>InnoDB</code> stores user data in clustered indexes to reduce I/O for common queries based on primary keys. To maintain data integrity, <code>InnoDB</code> also supports <code>FOREIGN KEY</code> referential-integrity constraints.</li>
<li><a href="https://dev.mysql.com/doc/refman/5.5/en/myisam-storage-engine.html"><code>MyISAM</code></a>: The MySQL storage engine that is used the most in Web, data warehousing, and other application environments. <code>MyISAM</code> is supported in all MySQL configurations, and is the default storage engine prior to MySQL 5.5.5.</li>
<li><a href="https://dev.mysql.com/doc/refman/5.5/en/memory-storage-engine.html"><code>Memory</code></a>: Stores all data in RAM for extremely fast access in environments that require quick lookups of reference and other like data. This engine was formerly known as the <code>HEAP</code> engine.</li>
<li><a href="https://dev.mysql.com/doc/refman/5.5/en/merge-storage-engine.html"><code>Merge</code></a>: Enables a MySQL DBA or developer to logically group a series of identical <code>MyISAM</code> tables and reference them as one object. Good for VLDB environments such as data warehousing.</li>
<li><a href="https://dev.mysql.com/doc/refman/5.5/en/archive-storage-engine.html"><code>Archive</code></a>: Provides the perfect solution for storing and retrieving large amounts of seldom-referenced historical, archived, or security audit information.</li>
<li><a href="https://dev.mysql.com/doc/refman/5.5/en/federated-storage-engine.html"><code>Federated</code></a>: Offers the ability to link separate MySQL servers to create one logical database from many physical servers. Very good for distributed or data mart environments.</li>
<li><a href="https://dev.mysql.com/doc/refman/5.5/en/mysql-cluster.html"><code>NDB</code></a> (also known as <a href="https://dev.mysql.com/doc/refman/5.5/en/mysql-cluster.html"><code>NDBCLUSTER</code></a>): This clustered database engine is particularly suited for applications that require the highest possible degree of uptime and availability.</li>
</ul>
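<p>As a quick way to see which of these engines a given server actually supports, and to pick one explicitly, statements like the following can be used (a minimal sketch; the table name here is illustrative):</p>
<pre><code>-- list the storage engines compiled into this server and their support status
SHOW ENGINES;

-- show the default engine used for new tables
SHOW VARIABLES LIKE 'default_storage_engine';

-- choose an engine explicitly at table-creation time
CREATE TABLE t_example (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  payload TEXT
) ENGINE=MyISAM;

-- convert an existing table to InnoDB
ALTER TABLE t_example ENGINE=InnoDB;</code></pre>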
<p><u><strong>Storage Engines Feature Summary</strong></u></p>
<p><a href="http://topexperts.co.il/wp-content/uploads/2016/01/image.png"><img width="552" height="449" title="image" style="border: 0px currentcolor; padding-top: 0px; padding-right: 0px; padding-left: 0px; margin-right: auto; margin-left: auto; float: none; display: block; background-image: none;" alt="image" src="http://topexperts.co.il/wp-content/uploads/2016/01/image_thumb.png" border="0"></a></p>
<p><a href="https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html#idm139924297501104"><sup>[a] </sup></a>InnoDB support for geospatial indexing is available in MySQL 5.7.5 and higher.</p>
<p><a href="https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html#idm139924297493232"><sup>[b] </sup></a>InnoDB utilizes hash indexes internally for its Adaptive Hash Index feature.</p>
<p><a href="https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html#idm139924297490272"><sup>[c] </sup></a>InnoDB support for FULLTEXT indexes is available in MySQL 5.6.4 and higher.</p>
<p><a href="https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html#idm139924297480656"><sup>[d] </sup></a>Compressed MyISAM tables are supported only when using the compressed row format. Tables using the compressed row format with MyISAM are read only.</p>
<p><a href="https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html#idm139924297479344"><sup>[e] </sup></a>Compressed InnoDB tables require the InnoDB Barracuda file format.</p>
<p><a href="https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html#idm139924297477568"><sup>[f] </sup></a>Implemented in the server (via encryption functions), rather than in the storage engine.</p>
<p><a href="https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html#idm139924297472128"><sup>[g] </sup></a>Implemented in the server, rather than in the storage engine.</p>
<p><a href="https://dev.mysql.com/doc/refman/5.5/en/storage-engines.html#idm139924297466720"><sup>[h] </sup></a>Implemented in the server, rather than in the storage engine.</p>
]]></content:encoded>
			<wfw:commentRss>http://topexperts.co.il/mysql-supported-storage-engines/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>kdb+ and MongoDB case study</title>
		<link>http://topexperts.co.il/kdb-and-mongodb-case-study/</link>
		<comments>http://topexperts.co.il/kdb-and-mongodb-case-study/#comments</comments>
		<pubDate>Thu, 19 Mar 2015 19:28:57 +0000</pubDate>
		<dc:creator><![CDATA[Gadi Chen]]></dc:creator>
				<category><![CDATA[Blog]]></category>

		<guid isPermaLink="false">http://topexperts.co.il/?p=320</guid>
		<description><![CDATA[This is a post written by my friend Jamie Grant from AquaQ. Enjoy: Big Data technologies have been all over the technology press for what seems like a long time now. MongoDB is an established player in the space, one of the first examples of what have come to be called ‘NoSQL’ databases. With its schema-less design&#8230;<p><a class="more-link" href="http://topexperts.co.il/kdb-and-mongodb-case-study/" title="Continue reading &#8216;kdb+ and MongoDB case study&#8217;">Continue reading <span class="meta-nav">&#8594;</span></a></p>]]></description>
				<content:encoded><![CDATA[<p><strong>This is a post written by my friend Jamie Grant from AquaQ. Enjoy:</strong></p>
<p>Big Data technologies have been all over the technology press for what seems like a long time now. MongoDB is an established player in the space, one of the first examples of what have come to be called ‘NoSQL’ databases. With its schema-less design, MongoDB is well suited to storing unstructured data, and it promises many features to make it easy to scale across multiple servers. MongoDB version 2.6 also introduced support for full-text indexing and search.</p>
<p>With our traditional expertise being in kdb+, we decided to look at ways the two technologies (kdb+ and MongoDB) might complement each other. kdb+ excels at time-series analysis of structured data, while MongoDB offers text indexing and search functionality that doesn’t yet have an equivalent in kdb+.</p>
<p>For this case study, we’re using MongoDB as the text store for a kdb+ database – offloading the storage of large text documents from the main kdb+ server and taking advantage of MongoDB text indexing for searching, while retaining the speed of kdb+ for searches and reporting on the structured data.</p>
<h6><font face="Aharoni"><font style="font-weight: bold" size="5"><u>mongoq</u></font></font></h6>
<p>mongoq is a set of libraries built on top of the <a href="https://github.com/mongodb/mongo-c-driver">mongo c driver</a>, allowing access to a MongoDB database directly from q. The C functions take json strings as parameters, which are converted to <a href="http://bsonspec.org/">bson</a> documents by <a href="https://github.com/mongodb/libbson">libbson</a> prior to being passed to MongoDB. Result sets are converted from bson to json and returned from C to q as a list of json strings.</p>
<p>Some wrapper q functions are provided which take dictionary parameters, handling json encoding and decoding using <code>j.k</code>.</p>
<p>Source code, documentation, and examples can be found on our github site <a href="https://github.com/AquaQAnalytics/mongoq">here</a>. The interface is in POC/alpha state at the moment – mongo c driver API coverage isn’t complete yet, and there are performance improvements and better error handling planned in the near future.</p>
<h4><font style="font-weight: bold" size="6" face="Aharoni"><u>Reddit Database</u></font></h4>
<p>As a dataset for our case study, we found 1GB of reddit comments in <a href="http://www.reddit.com/r/datasets/comments/26ijdl/15_million_reddit_comments_from_november_sql_and/">this</a> thread.</p>
<h6><font size="5" face="Aharoni"><font style="font-weight: bold">Loading</font></font></h6>
<p>We used a q script to parse the data, maintaining the metadata in a kdb+ splayed table, with any long strings (comment text, topic, comment html) pushed to MongoDB and referenced by a MongoDB id in the kdb+ table.</p>
<p>The loader script outputs the following, including a breakdown of the timings:</p>
<pre><code><font style="background-color: #cccccc"><strong>&gt; q examples/comments_load.q mydb comments.all.sql
...
Processing - 1020 of 1048 MB : 97.4%
Processing - 1029 of 1048 MB : 98.3%
Processing - 1039 of 1048 MB : 99.2%
Processing - 1047 of 1048 MB : 100%
#####################################
# reading    | 0D00:00:00.743248000 #
# parsing    | 0D00:02:31.920447000 #
# mongoinsert| 0D00:03:13.843763000 #
# kdbwrite   | 0D00:01:40.621209000 #
# mongoindex | 0D00:02:31.501243000 #
# TOTAL      | 0D00:09:58.629910000 #
#####################################</strong></font></code></pre>
<p>Free text data always presents a parsing challenge, and this data has been imported to a sql database, then exported to text again, so the escaping of special characters is inconsistent. We’ve chosen to process the .sql file instead of the .csv as it’s easier to tell apart genuine record breaks from newlines inside comments. We aren’t going to have 100% success rate at processing every post into the correct table structure, but the script should recover within a record or two if it’s thrown off by the content of a post.</p>
<p>The following code is where the loader script writes to the two databases: MongoDB via the mongoq library, and kdb+ appending to a splayed table on disk.</p>
<pre><code><font style="background-color: #cccccc"><strong>processdata:{[t]
  mgcols:`link_title`subreddit`body`body_html`author_flair_text;
  oid:.mg.add[`comments;mgcols#t];
  (` sv db,`comments`) upsert .Q.en[db] (mgcols _ t),'([]mgid:oid);
 };</strong></font></code></pre>
<p><font style="background-color: #ffffff" face="Courier New">The .mg.add function takes a table or list of dictionaries as a parameter, and returns a vector of MongoDB _id values as a 16 byte GUID type. </font></p>
<p><font style="background-color: #ffffff" face="Courier New">This id can be used for retrieval:</font></p>
<pre><code><font style="background-color: #cccccc"><strong>q)id:.mg.add[`test;([]time:12 13t;sym:`xx`yyy;price:11.2 34.5)]
q)id
00000000-54f6-4410-dfa8-3258b51290e8 00000000-54f6-4410-dfa8-3258b51290e9
q).mg.find[`test;id;()] / all fields for these ids in 'test' collection
time           sym   price
--------------------------
"12:00:00.000" "xx"  11.2
"13:00:00.000" "yyy" 34.5</strong></font></code></pre>
<h6><font size="5" face="Aharoni"><font style="font-weight: bold">Text Index</font></font></h6>
<p>MongoDB has a <a href="http://docs.mongodb.org/manual/core/index-text/">text indexing</a> feature which supports stemming and relevance scoring. We’re going to use this feature as it provides functionality not readily available in q. The mongoq interface includes an ‘addindex’ function which accepts index parameters as a json string. To create a text index on the ‘body’ field in the comments records we’re storing, we do the following at the end of the loader script:
<pre><code><font style="background-color: #cccccc"><strong>.mg.addindex[`comments;.j.j enlist[`body]!enlist `text]</strong></font></code></pre>
<p>To index across all text fields, <code>`$"**"</code> can be used in place of the field name.</p>
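<p>For reference, the equivalent indexes created directly in the MongoDB shell look like this (a sketch of standard MongoDB syntax; the mongoq call above achieves the same thing via its json parameter):</p>
<pre><code>// text index on the 'body' field only
db.comments.ensureIndex({body: "text"})
// wildcard text index across all string fields
db.comments.ensureIndex({"$**": "text"})</code></pre>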
<h6><font style="font-weight: bold" size="5" face="Aharoni">Queries</font></h6>
<p>The following are some example queries and their execution time on our machine. Queries which ship large amounts of data to or from Mongo will be slowed down by the extra step in the bson&lt;-&gt;json&lt;-&gt;kobject conversion – this would ideally be replaced with a direct bson&lt;-&gt;kobject conversion library.</p>
<h6><font size="5" face="Aharoni"><font style="font-weight: bold">TOP 10 TOPICS FOR SEARCH TERM</font></font></h6>
<pre><code><font style="background-color: #cccccc"><strong>q)topics:{[term] 10 sublist select score, subreddit, title:40 sublist' link_title from `score xdesc 0!select max score, first subreddit by link_title from .mg.search[`comments;term]}
q)\t topics "IBM"
63
q)topics "IBM"
score    subreddit          title
----------------------------------------------------------------------
0.963724 "todayilearned"    "TIL that Bill Gates told the creators of"
0.904494 "linux"            "Anybody know a bit about older keyboards"
0.803571 "starcraft"        "Why, and how much, do Blizzard invest in"
0.779412 "gamecollecting"   "Looking for a SEGA Saturn Keyboard."
0.765306 "Bitcoin"          "Seeking Alpha''s AJ Watkinson Predicts a"
0.761811 "buildapc"         "What is the major differance between Int"
0.760274 "linux"            "Linux Desktop''s Missed Opportunities"
0.75     "learnprogramming" "Is anyone here familiar with Assembly? I"
0.666667 "apple"            "Apple''s iPhone, iPad used to place over"
0.583333 "Games"            "FIFA Manager series cancelled"</strong></font></code></pre>
<h6><font size="5" face="Aharoni">THREADS AND COMMENTS AGGREGATED BY SUBFORUM FOR SEARCH TERM</font></h6>
<pre><code><font style="background-color: #cccccc" face="Aharoni">q)threads
{[term]
  t:comments comments[`mgid] bin m:.mg.searchid[`comments;term];
  t:(`$.mg.find[`comments;m;`subreddit]),'t;
  r:`threads xdesc select threads:count distinct link_id, comments:count i by subreddit from t;
  r}
q)threads "\"ivy bridge\"" / search exact term
subreddit      | threads comments
---------------| ----------------
buildapc       | 9       9
gaming         | 2       3
pcmasterrace   | 2       2
AskReddit      | 1       1
WildStar       | 1       1
battlestations | 1       1	
buildapcforme  | 1       1
buildapcsales  | 1       1
dayz           | 1       1
hardware       | 1       1
linux          | 1       1
mcservers      | 1       1
programming    | 1       1
techsupportgore| 1       1</font></code></pre>
<h6><font size="5" face="Aharoni">Conclusions</font></h6>
<ul>
<li>MongoDB presents an interesting proposition as a text store backing a kdb+ database, opening up search features that wouldn’t be easily replicated in a kdb+-only database.</li>
<li>Performance of the adaptor is acceptable as long as the amount of data passed between MongoDB and kdb+ is minimised. Future versions of the adaptor will skip the expensive json translation step with a direct bson&lt;-&gt;kobject translation library.</li>
<li>All the adaptor code used for this post is available as open source on our <a href="https://github.com/AquaQAnalytics/mongoq">github</a> site. It should be considered in alpha state – please report bugs and feature requests through github.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://topexperts.co.il/kdb-and-mongodb-case-study/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
