<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tech Presentations &#187; Google FS</title>
	<atom:link href="http://www.techpresentations.com/category/distributed-systems/googlefs/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.techpresentations.com</link>
	<description>Blog about technical presentations, mostly related to web</description>
	<lastBuildDate>Fri, 09 Jan 2009 18:04:00 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9-rare</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Google Internals</title>
		<link>http://www.techpresentations.com/2007/02/10/google-internals/</link>
		<comments>http://www.techpresentations.com/2007/02/10/google-internals/#comments</comments>
		<pubDate>Sat, 10 Feb 2007 07:00:52 +0000</pubDate>
		<dc:creator>Sergey Chernyshev</dc:creator>
				<category><![CDATA[BigTable]]></category>
		<category><![CDATA[Distributed systems]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Google FS]]></category>
		<category><![CDATA[Google Workqueue]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Slides]]></category>

		<guid isPermaLink="false">http://www.techpresentations.com/2007/02/10/google-internals/</guid>
		<description><![CDATA[Presentation about Google&#8217;s internal systems by independent researcher Toby DiPasquale given at Philadelphia LUG on August 2nd, 2006 (slides)

]]></description>
			<content:encoded><![CDATA[<p>Presentation about Google&#8217;s internal systems by independent researcher <a href="http://cbcg.net/">Toby DiPasquale</a> given at <a href="http://www.phillylinux.org/">Philadelphia LUG</a> on August 2<sup>nd</sup>, 2006 (<a href="http://cbcg.net/talks/googleinternals/index.html">slides</a>)</p>
<p><a href="http://cbcg.net/talks/googleinternals/index.html"><img src="http://farm1.static.flickr.com/162/385264373_adb56ebcfb_m.jpg" alt="Google Internals" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpresentations.com/2007/02/10/google-internals/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>BigTable: A Distributed Structured Storage System</title>
		<link>http://www.techpresentations.com/2006/11/20/bigtable-a-distributed-structured-storage-system/</link>
		<comments>http://www.techpresentations.com/2006/11/20/bigtable-a-distributed-structured-storage-system/#comments</comments>
		<pubDate>Mon, 20 Nov 2006 05:16:36 +0000</pubDate>
		<dc:creator>Sergey Chernyshev</dc:creator>
				<category><![CDATA[BigTable]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Google FS]]></category>
		<category><![CDATA[Google Workqueue]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Slides]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://www.techpresentations.com/2006/11/20/bigtable-a-distributed-structured-storage-system/</guid>
		<description><![CDATA[Given by Jeff Dean (Google) at the University of Washington on Oct 18, 2005 (video, slides)
 
BigTable is a distributed storage system for managing structured data that is designed to scale to a very large size.
Interesting quotes from presentation:

Scale is too big for commercial databases, which also can&#8217;t run on cheap clustered servers.
Features:

Distributed [...]]]></description>
			<content:encoded><![CDATA[<p>Given by <a href="http://labs.google.com/people/jeff/">Jeff Dean</a> (Google) at the <a href="http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=437">University of Washington</a> on Oct 18, 2005 (<a href="http://video.google.com/videoplay?docid=7278544055668715642">video</a>, <a href="http://lukebaker.org/photos/?tags=google,bigtable&#038;reverse=t&#038;size=-">slides</a>)</p>
<p><embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=7278544055668715642&#038;hl=en" flashvars=""> </embed></p>
<p><a href="http://labs.google.com/papers/bigtable.html">BigTable</a> is a distributed storage system for managing structured data that is designed to scale to a very large size.</p>
<p>Interesting quotes from presentation:</p>
<ul>
<li>Scale is too big for commercial databases, which also can&#8217;t run on cheap clustered servers.</li>
<li>Features:
<ul>
<li>Distributed multi-level map</li>
<li>Fault tolerant, persistent</li>
<li>Scalable (thousands of servers, terabytes of in-memory data, petabyte of disk data, millions/sec of r/w, efficient scans)</li>
<li>Self-managing (servers can be added/removed dynamically, servers adjust to load imbalance)</li>
</ul>
</li>
<li>Largest BigTable cells (data collections) hold ~200TB spread over thousands of servers</li>
<li>Built upon:
<ul>
<li><a href="http://en.wikipedia.org/wiki/Google_File_System">GFS</a></li>
<li>Workqueue (scheduler)</li>
<li><a href="http://labs.google.com/papers/chubby.html">Lock service</a></li>
<li><a href="http://en.wikipedia.org/wiki/MapReduce">MapReduce</a></li>
</ul>
</li>
<li>Multidimensional &#8211; a row (e.g. URL) and a column (attribute) identify a cell; each cell holds time-based versions of its value.</li>
<li>related rows (tablets) are located on the same machines for better performance</li>
<li>load balancing moves tablets around</li>
<li>tablets are replicated across multiple machines</li>
<li>requests like &#8220;get recent X values&#8221; are possible</li>
<li>columns can be configured to retain only X most recent entries</li>
<li>locality groups to partition tablets</li>
<li>has huge logging problems</li>
<li>a lot of opportunities for compression &#8211; time-shifted data is similar, many values are the same. Using BMDiff (dictionary-based compression) &#8211; encode ~100MB/s, decode ~1000MB/s; Zippy (LZW-like) &#8211; 179MB/s, 409MB/s</li>
<li>Compression experiment results: web pages compress at 9.2%, links at 13.2%, anchors at 12.7%</li>
</ul>
<p><strong>Update</strong>: Luke Baker made <a href="http://lukebaker.org/photos/?tags=google,bigtable&#038;reverse=t&#038;size=-">screenshots</a> of all the slides from the video (not quite in the right order).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpresentations.com/2006/11/20/bigtable-a-distributed-structured-storage-system/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Building Large Systems at Google</title>
		<link>http://www.techpresentations.com/2006/11/19/building-large-systems-at-google/</link>
		<comments>http://www.techpresentations.com/2006/11/19/building-large-systems-at-google/#comments</comments>
		<pubDate>Mon, 20 Nov 2006 03:14:59 +0000</pubDate>
		<dc:creator>Sergey Chernyshev</dc:creator>
				<category><![CDATA[BigTable]]></category>
		<category><![CDATA[Distributed systems]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Google FS]]></category>
		<category><![CDATA[MapReduce]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://www.techpresentations.com/2006/11/19/building-large-systems-at-google/</guid>
		<description><![CDATA[Google TechTalks presentation by Narayanan Shivakumar, Google Inc. at  (video) on May 31, 2006
 
Interesting presentation outlining the major parts of the infrastructure Google uses to run all its projects.
These are the main infrastructure systems used at Google:

GFS &#8211; Google file system (1+ PB, 1000+ machines, 1000+ clients, 10 Gb IO)
MapReduce &#8211; distributed computation system [...]]]></description>
			<content:encoded><![CDATA[<p>Google TechTalks presentation by Narayanan Shivakumar, Google Inc. at  (<a href="http://video.google.com/videoplay?docid=-5699448884004201579">video</a>) on May 31, 2006</p>
<p><embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=-5699448884004201579&#038;hl=en"> </embed></p>
<p>Interesting presentation outlining the major parts of the infrastructure Google uses to run all its projects.</p>
<p>These are the main infrastructure systems used at Google:</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Google_File_System">GFS</a> &#8211; Google file system (1+ PB, 1000+ machines, 1000+ clients, 10 Gb IO)</li>
<li><a href="http://en.wikipedia.org/wiki/MapReduce">MapReduce</a> &#8211; distributed computation system using <a href="http://labs.google.com/papers/mapreduce.html">MapReduce programming model</a></li>
<li><a href="http://en.wikipedia.org/wiki/BigTable">BigTable</a> &#8211; distributed structured storage system</li>
</ul>
<p>P.S. Unfortunately the slides are hard to see in the video, so you&#8217;ll have to listen patiently.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techpresentations.com/2006/11/19/building-large-systems-at-google/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
