SlideShare a Scribd company logo
1 of 97
Download to read offline
Sharding for the masses
                    Introducing the Spider
                  storage engine, and more



                                 Giuseppe Maxia
                                 Kentoku Shiba

                         This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License.

Tuesday, 13 April 2010
What is sharding?



                         "shard" is a piece of broken ceramic or glass

                         "Sharding" means breaking a database to pieces




Tuesday, 13 April 2010
WHY SHARDING?




Tuesday, 13 April 2010
WHY SHARDING?
                         SCALING

                         SCALING

                         SCALING
                         SCALING
                          SCALING
Tuesday, 13 April 2010
Scaling: the problem


                         You start with one server

                           Too much data

                           Too much traffic

                         Now what?




Tuesday, 13 April 2010
Scaling: the solution


                         The MySQL way

                         Also known as the Yahoo and Google way

                         Replication




Tuesday, 13 April 2010
Replication: how it works
                                         Master
                         WRITE

            client                                slave
                                 slave



                 READ
                            load
                          balancer        slave


Tuesday, 13 April 2010
Replication: how it scales
                                                 Master
                          WRITE

            client                    slave
                                       slave
             client                     slave                slave
                                                              slave
                                         slave                 slave
                                                                slave
              client
               client
                client
                 client


           READ              load
                           balancer               slave
                                                   slave
                                                    slave
                                                     slave

Tuesday, 13 April 2010
Replication: how it chokes
                                                       Master
                                                                   > data
                         WRITE


            client                          slave
                                             slave
             client                           slave                slave
                                                                    slave
                                               slave                 slave
                                                                      slave
              client
               client
                client
                 client


           READ                    load
                                 balancer               slave
                                                         slave
                                                          slave
                                                           slave

Tuesday, 13 April 2010
Homemade sharding
        client
                                        Master1
                                                  rule1
         WRITE
         READ                           Master2
                                                  rule2
                          application
                             logic

                                        MasterN
                                         MasterN
                                          MasterN
                                           MasterN
                                                  ruleN
                                                   ruleN
                                                    ruleN
                                                     ruleN

Tuesday, 13 April 2010
How homemade sharding
                         breaks
        client
                                       Master1
                                                 rule1
         WRITE
         READ                          Master2
                                                 rule2
                         application
                            logic

                                       MasterN
                                        MasterN
                                         MasterN
                                          MasterN
                                                 ruleN
                                                  ruleN
                                                   ruleN
                                                    ruleN

Tuesday, 13 April 2010
How homemade sharding
                         breaks
        client
                                       Master1
                                                 rule1
         WRITE
         READ                          Master2
                                                 rule2
                         application
                            logic

                                       MasterN
                                        MasterN
                                         MasterN
                                          MasterN
                                                 ruleN
                                                  ruleN
                                                   ruleN
                                                    ruleN

Tuesday, 13 April 2010
How homemade sharding
                         breaks
        client



         WRITE
         READ
                         application
                            logic

                                            Master1
                                           Master2
                                       MasterN
                                        MasterN      rule1
                                         MasterN rule2
                                          MasterN
                                                 ruleN
                                                  ruleN
                                                   ruleN
                                                    ruleN
Tuesday, 13 April 2010
How homemade sharding
                        breaks (2)
        client
                                       Master1
                                                  rule1
                                                   rule1
                                                    rule1
                                                  rule1
                                                 rule1
         WRITE
         READ                          Master2
                                                 rule2
                                                  rule2
                         application               rule2
                                                  rule2
                                                 rule2
                            logic

                                       MasterN
                                        MasterN
                                         MasterN
                                          MasterN
                                                 ruleN
                                                ruleN
                                                  ruleN
                                               ruleN
                                              ruleNruleN
                                                    ruleN
                                              ruleN

Tuesday, 13 April 2010
How homemade sharding
                        breaks (2)
        client
                                             Master1
                                                        rule1
                                                         rule1
                                                          rule1
                                                        rule1
                                                       rule1
     WRITE
     READ                                    Master2
                                                       rule2
                                                        rule2
                                                         rule2
                                                        rule2
                                                       rule2
                         application logic

                                             MasterN
                                              MasterN
                                               MasterN
                                                MasterN
                                                       ruleN
                                                      ruleN
                                                        ruleN
                                                     ruleN
                                                    ruleNruleN
                                                          ruleN
                                                    ruleN

Tuesday, 13 April 2010
The quest for magic
                              sharding




Tuesday, 13 April 2010
The quest for magic
                                sharding

                         MySQL Proxy

                           HSCALE

                           SpockProxy




Tuesday, 13 April 2010
The quest for magic
                                sharding

                         MySQL Proxy

                           HSCALE

                           SpockProxy

                         DON'T SCALE (SPoF)




Tuesday, 13 April 2010
Horizontal
                   partitioning
Tuesday, 13 April 2010
Introducing Spider
                         A MySQL storage engine

                         Developed by Kentoku Shiba

                         Built on top of the partitions engine

                         Associates a partition with a remote server

                         Transparent to user

                         Easy to expand

                         Independent from application
Tuesday, 13 April 2010
Note about partitions
                         A feature introduced in MySQL 5.1

                         Horizontal partitioning

                         Transparent to users

                         Increases insertion and selection performance

                         presentations

                            http://tinyurl.com/mysql-partition-tut

                            http://tinyurl.com/mysql-partition-perf

Tuesday, 13 April 2010
host2   MySQL server
                                              without
                                              SPIDER
  Spider conceptual model
                                    host3   MySQL server
                                              without
                                              SPIDER

         table employees            host4   MySQL server
 partition 1 year 1997 host2                  without
                                              SPIDER
 partition 2 year 1998 host3
 partition 3 year 1999 host4        host5   MySQL server
                                              without
 partition 4 year 2000 host5
                                              SPIDER
 partition 5 year 2001 host6
 partition 6 year 2002 host7        host6   MySQL server
                                              without
                                              SPIDER
  host1              MySQL server
                      with SPIDER   host7   MySQL server
                                              without
                                              SPIDER


Tuesday, 13 April 2010
host2   MySQL server
                                                          without
                                                          SPIDER
  Spider conceptual model
                                                host3   MySQL server
                                                          without
                                                          SPIDER

         table employees                        host4   MySQL server
 partition 1 year 1997 host2                              without
                                                          SPIDER
 partition 2 year 1998 host3
 partition 3 year 1999 host4                    host5   MySQL server
                                                          without
 partition 4 year 2000 host5
                                                          SPIDER
 partition 5 year 2001 host6
 partition 6 year 2002 host7                    host6   MySQL server
                                                          without
                                                          SPIDER
  host1              MySQL server
                      with SPIDER               host7   MySQL server
                                        no                without
                                                          SPIDER
                                    data here

Tuesday, 13 April 2010
select * from
                            employees where
                          date = '1998-01-01'   host3   MySQL server
                                                          without
                                                          SPIDER

         table employees                        host4   MySQL server
 partition 1 year 1997 host2                              without
                                                          SPIDER
 partition 2 year 1998 host3
 partition 3 year 1999 host4                    host5   MySQL server
                                                          without
 partition 4 year 2000 host5
                                                          SPIDER
 partition 5 year 2001 host6
 partition 6 year 2002 host7                    host6   MySQL server
                                                          without
                                                          SPIDER
  host1              MySQL server
                      with SPIDER               host7   MySQL server
                                                          without
                                                          SPIDER


Tuesday, 13 April 2010
host2   MySQL server
                                                          without
                             select * from                SPIDER
                            employees where
                          date = '1998-01-01'   host3   MySQL server
                                                          without
                                                          SPIDER

         table employees                        host4   MySQL server
 partition 1 year 1997 host2                              without
                                                          SPIDER
 partition 2 year 1998 host3
 partition 3 year 1999 host4                    host5   MySQL server
                                                          without
 partition 4 year 2000 host5
                                                          SPIDER
 partition 5 year 2001 host6
 partition 6 year 2002 host7                    host6   MySQL server
                                                          without
                                                          SPIDER
  host1              MySQL server
                      with SPIDER               host7   MySQL server
                                                          without
                                                          SPIDER


Tuesday, 13 April 2010
host2   MySQL server
                                              without
                                              SPIDER

                                    host3   MySQL server
                                              without
                                              SPIDER

         table employees            host4   MySQL server
 partition 1 year 1997 host2                  without
                                              SPIDER
 partition 2 year 1998 host3
 partition 3 year 1999 host4        host5   MySQL server
                                              without
 partition 4 year 2000 host5
                                              SPIDER
 partition 5 year 2001 host6
 partition 6 year 2002 host7        host6   MySQL server
                                              without
                                              SPIDER
  host1              MySQL server
                      with SPIDER   host7   MySQL server
                                              without
                                              SPIDER


Tuesday, 13 April 2010
select * from        host2   MySQL server
                                                     without
                 employees where date =              SPIDER
                  '1998-01-01' limit 0,
                     9223372036854775807
                                           host3   MySQL server
                                                     without
                                                     SPIDER

         table employees                   host4   MySQL server
 partition 1 year 1997 host2                         without
                                                     SPIDER
 partition 2 year 1998 host3
 partition 3 year 1999 host4               host5   MySQL server
                                                     without
 partition 4 year 2000 host5
                                                     SPIDER
 partition 5 year 2001 host6
 partition 6 year 2002 host7               host6   MySQL server
                                                     without
                                                     SPIDER
  host1              MySQL server
                      with SPIDER          host7   MySQL server
                                                     without
                                                     SPIDER


Tuesday, 13 April 2010
host2   MySQL server
                                              without
                                              SPIDER

                                    host3   MySQL server
                                              without
                                              SPIDER

         table employees            host4   MySQL server
 partition 1 year 1997 host2                  without
                                              SPIDER
 partition 2 year 1998 host3
 partition 3 year 1999 host4        host5   MySQL server
                                              without
 partition 4 year 2000 host5
                                              SPIDER
 partition 5 year 2001 host6
 partition 6 year 2002 host7        host6   MySQL server
                                              without
                                              SPIDER
  host1              MySQL server
                      with SPIDER   host7   MySQL server
                                              without
                                              SPIDER


Tuesday, 13 April 2010
host8               MySQL server with SPIDER   host2   MySQL server
                                                         without
                                                         SPIDER
          table employees
  partition 1 year 1997 host2                  host3   MySQL server
  partition 2 year 1998 host3                            without
                                                         SPIDER
  partition 3 year 1999 host4
  partition 4 year 2000 host5                  host4   MySQL server
  partition 5 year 2001 host6                            without
                                                         SPIDER
  partition 6 year 2002 host7
                                               host5   MySQL server
host1               MySQL server with SPIDER
                                                         without
                                                         SPIDER

          table employees                      host6   MySQL server
  partition 1 year 1997 host2                            without
                                                         SPIDER
  partition 2 year 1998 host3
  partition 3 year 1999 host4                  host7   MySQL server
  partition 4 year 2000 host5                            without
                                                         SPIDER
  partition 5 year 2001 host6
  partition 6 year 2002 host7
Tuesday, 13 April 2010
data            data   data   data   data




Tuesday, 13 April 2010
data            data   data   data   data




Tuesday, 13 April 2010
data            data   data   data   data




Tuesday, 13 April 2010
data            data   data   data   data




Tuesday, 13 April 2010
data            data   data   data   data




Tuesday, 13 April 2010
data            data   data     data   data




                                client

Tuesday, 13 April 2010
data            data   data        data   data




                                client
                                 client
                                  client
                                   client
Tuesday, 13 April 2010
data       data   data   data   data
                 HA      HA     HA     HA     HA

              data       data   data   data   data




Tuesday, 13 April 2010
INSTALLATION




Tuesday, 13 April 2010
INSTALLATION (1)

                         Get the source code for MySQL 5.1.39

                           http://dev.mysql.com/downloads

                         Get the source code for Spider 2.5

                           http://launchpad.net/spiderformysql

                         Get the patch for condition pushdown

                           https://launchpad.net/
                           partitionconditionpushdownformysql

Tuesday, 13 April 2010
INSTALLATION (2)
                         Unpack MySQL source code

                         Unpack Spider source code and docs

                         Unpack the condition pushdown patch

       mkdir spider
       cd spider
       tar -xzf mysql-5.1.39.tar.gz
       tar -xzf spider-src-1.0-for-5.1.39.tgz
       tar -xzf spider-doc-1.0-for-5.1.39.tgz
       tar -xzf partition_cond_push-0.1-
       for-5.1.36.tgz
Tuesday, 13 April 2010
/   l
            .'      :                    __.....__..._ ____
           / /                    _.-"              "-. ""            "-.
         (`-: .---:         .--.'                  _....J.                 "-.
           """y        ,.'        __..--""                 `+""--.           `.
             :        .'/      .-"""-. _.                       `.      "-.       `._.._
             ; _.'.' .-j                 `.                                "-.     "-._`.
             :      / .-" :                  `-.                    `-         "-.        
               ; /.'        ;               :;                         ."                   `,
               :_:/         ::             ;:         (             /     .-"     .')        ;




                                                                                                  STOP!
                  ;-"         ; "-.       / ;                     .^. .'        .' /     .-"
                /       .- :         `. '. : .- / __.-j.'.'                 .-" /.---'
               / /          `,. .'         "":' /-"           .'           __.'
             : :               ,""            ; .'         .'          .-""
            _J ;               ; `.          /.'         _/       .-"
           / "-:              /"--.b-..-'            .'            ;
         /        / ""-..'                     .--'.-'/ , :
        :`.     :       / :                    `-i" ,',_: _ 
        :  '._ :__;                           .'.-"; ; ; j `.l
                          "-._               `" :_/ :_/
           `.;                   "-._



                                                                          What the hell is a
             :_"-._                    "-.
                `. l "-.           )        `.
                   ""^--""^-. :                
                                ";               
                                :                  `._




                                                                             "condition
                                ; /        `._        ""---.
                               / /     _           `.--.__.'
                              : :     / ; :". 
                              ; ; : : ; `. `.
                            / ; :         ; :        `. `.
                           / /: ;         : ;            "-'



                                                                            pushdown"?
                          :_.' ; ;          ; :
                               / /          :_l
                               `-'




Tuesday, 13 April 2010
MASTER

                                                     remote
                  SELECT * FROM sometable            server
                      WHERE col1 = 2               management




                                            SELECT col1,col2,col3
                                            FROM sometable LIMIT
                                                  0,9999999
                Without
               condition                         REMOTE SERVER
               pushdown
Tuesday, 13 April 2010
MASTER

                                                     remote
                  SELECT * FROM sometable            server
                      WHERE col1 = 2               management




                                            SELECT col1,col2,col3
                                               FROM sometable
                                               WHERE col1 = 2
                 With                          LIMIT 0,9999999

               condition
               pushdown                          REMOTE SERVER

Tuesday, 13 April 2010
INSTALLATION (3)
                         Move the spider directory into MySQL source
                         code




       mv spider mysql-5.1.39/storage




Tuesday, 13 April 2010
INSTALLATION (4)
                         Apply the Spider patches to MySQL code




       cd mysql-5.1.39
       patch -p2 < ../mysql-5.1.39.spider.diff
       patch -p2 < 
        ../mysql-5.1.36.partition_cond_push.diff



Tuesday, 13 April 2010
INSTALLATION (5)
                         Compile MySQL code (see the docs for details)
       autoconf
       automake
       ./configure --enable-thread-safe-client 
           --enable-local-infile 
           --with-pic --with-fast-mutexes 
           --with-client-ldflags=-static 
           --with-mysqld-ldflags=-static --with-zlib-dir=bundled 
           --with-big-tables --with-ssl --with-readline 
           --with-embedded-server --with-partition 
           --with-innodb --without-ndbcluster 
           --without-archive-storage-engine 
           --without-blackhole-storage-engine 
           --with-csv-storage-engine 
           --without-example-storage-engine 
           --without-federated-storage-engine 
           --with-extra-charsets=complex && make


Tuesday, 13 April 2010
INSTALLATION (6)
                         create a binary tarball




       ./scripts/make_binary_distribution




Tuesday, 13 April 2010
INSTALLATION (7)
                         Install manually in your main server

                         OR

                         use MySQL Sandbox

       make_sandbox 
         $PWD/mysql-5.1.39-osx10.5-i386.tar.gz 
         --sandbox_directory=spider_main




Tuesday, 13 April 2010
SETUP




Tuesday, 13 April 2010
SETUP (1)
              Get the SQL from the docs

              or get it from my site

                         http://datacharmer.org/downloads/spider_setup.sql

              Run it
       cd $HOME/sandboxes/spider_main
       wget http://datacharmer.org/downloads/
       spider_setup.sql
       ./use < spider_setup.sql


Tuesday, 13 April 2010
SETUP (2)
                         Check the engines


       ./use
       select engine,support,transactions,xa
           -> from information_schema.engines;
       +------------+---------+--------------+------+
       | engine     | support | transactions | xa   |
       +------------+---------+--------------+------+
       | SPIDER     | YES     | YES          | YES |
       | MRG_MYISAM | YES     | NO           | NO   |
       | CSV        | YES     | NO           | NO   |
       | MyISAM     | DEFAULT | NO           | NO   |
       | InnoDB     | YES     | YES          | YES |
       | MEMORY     | YES     | NO           | NO   |
       +------------+---------+--------------+------+


Tuesday, 13 April 2010
USING SPIDER
                          (Simple case)



Tuesday, 13 April 2010
Preparing remote servers
                              (1)
              Create three servers using MySQL Sandbox




       make_multiple_sandbox 
         --group_directory=spider_dir 
         --sandbox_base_port=6000 
         --check_base_port 5.1.39


Tuesday, 13 April 2010
Preparing remote servers
                              (2)
              Check the port numbers


       ~/sandboxes/spider_dir/use_all 
         "show variables like 'port'"
       # server: 1:
       Variable_name Value
       port 6001
       # server: 2:
       Variable_name Value
       port 6002
       # server: 3:
       Variable_name Value
       port 6003



Tuesday, 13 April 2010
Preparing remote servers
                              (3)
              create table definition (tablea.sql)

       drop schema if exists   myspider;
       create schema myspider;
       use myspider;

       Create table tbl_a(
               col_a int,
               col_b int,
               primary key(col_a)
       );

Tuesday, 13 April 2010
Preparing remote servers
                              (4)
              create table in remote servers

       cd $HOME/sandboxes/spider_dir
       ./use_all "source tablea.sql"




Tuesday, 13 April 2010
setting the main server (1)
              create table definition (tablea_main.sql)

       drop schema if exists   myspider;
       create schema myspider;
       use myspider;

       Create table tbl_b(
               col_a int,
               col_b int,
               primary key(col_a)
       ) engine = Spider
       -- continues ...
Tuesday, 13 April 2010
setting the main server (2)
              create table definition (tablea_main.sql) (continues)

Connection ' table "tbl_a", user "msandbox",
password "msandbox" '
partition by range( col_a ) (
    partition pt1 values less than (1000)
    comment 'host "127.0.0.1", port "6001"',
    partition pt2 values less than (2000)
    comment 'host "127.0.0.1", port "6002"',
    partition pt3 values less than (MAXVALUE)
    comment 'host "127.0.0.1", port "6003"'
);
Tuesday, 13 April 2010
setting the main server (3)
              create table

./use < tablea_main.sql




Tuesday, 13 April 2010
(Finally) using it (1)
              in the main server

./use myspider
insert into tbl_b values (500,1), 
(1500,2), (5000,3);
Query OK, 3 rows affected (0.01 sec)
Records: 3 Duplicates: 0 Warnings: 0




Tuesday, 13 April 2010
(Finally) using it (2)
              in the main server

  select * from tbl_b;
+-------+-------+
| col_a | col_b |
+-------+-------+
|    500 |     1 |
| 1500 |       2 |
| 5000 |       3 |
+-------+-------+
3 rows in set (0.01 sec)

Tuesday, 13 April 2010
WHERE IS THE DATA?


                         500 < 1000 = host 1

                         1500 < 2000 = host2

                         5000 < MAXVALUE = host3




Tuesday, 13 April 2010
Looking for the data
              in the "remote" servers
 $HOME/sandboxes/spider_dir/use_all 
  "select * from myspider.tbl_a"
# server: 1:
col_a col_b
500 1
# server: 2:
col_a col_b
1500 2
# server: 3:
col_a col_b
5000 3
Tuesday, 13 April 2010
Using Spider
                         (more complex case)



Tuesday, 13 April 2010
Setting more remote
                              servers (1)
              in the main server
./use myspider
drop table tbl_b;




Tuesday, 13 April 2010
Setting more remote
                              servers (2)
              create 20 remote servers
make_multiple_sandbox 
  --how_many_nodes=20 
  --group_directory=spider_dir 
  --sandbox_base_port=6000 
  5.1.39




Tuesday, 13 April 2010
Setting more remote
                              servers (3)
              create tables for the employees database
cd $HOME/sandboxes/spider_dir
wget http://datacharmer.org/downloads/
spider_remote_employees.sql
./use_all "source spider_remote_employees.sql"

# see also http://launchpad.net/test-db




Tuesday, 13 April 2010
the test employees database
           http://launchpad.net/test-db




Tuesday, 13 April 2010
Setting the main server (1)
              create tables for the employees database
cd $HOME/sandboxes/spider_main
wget http://datacharmer.org/downloads/
spider_main_employees.sql
./use < spider_main_employees.sql

# see also http://launchpad.net/test-db




Tuesday, 13 April 2010
checking the remote
                                    servers
              see how many rows have you got after loading

cd $HOME/sandboxes/spider_dir
./use_all                "select count(*) from employees.salaries"
# server:                1:
count(*)
0
# server:                2:
count(*)
18293
# server:                3:
count(*)
37957
# server:                4:
count(*)
57440
...

Tuesday, 13 April 2010
Performance




Tuesday, 13 April 2010
Spider engine performance


                         Comparable to the gains offered by partitioning

                         (from 30 to 1000% depending on query type)

                         Load easily split across masters




Tuesday, 13 April 2010
Running remote commands
                    with Spider



Tuesday, 13 April 2010
data                  data          data   data     data



              UDF
                 CREATE TABLE
                 db.t1(i int)

                 ""      host "127.0.0.1",
                           port "6001",
                          user "msandbox"


                                               1 = success
                                               0 = failure
Tuesday, 13 April 2010
data                  data          data   data     data



              UDF
               INSERT INTO db.t1
               VALUES (1000), (2000),
               (3000)

                 ""      host "127.0.0.1",
                           port "6001",
                          user "msandbox"


                                               1 = success
                                               0 = failure
Tuesday, 13 April 2010
data                 data         data          data     data




                         CREATE TEMPORARY TABLE
                         my_temp_table (i int)
                                                  my_temp_table




Tuesday, 13 April 2010
data              data      data      data     data



              UDF
               SELECT * FROM db.t1
               "my_temp_table"
                host "127.0.0.1",
                  port "6001",
                 user "msandbox"
                                        my_temp_table

                                        1000
                                        2000
                                        3000

Tuesday, 13 April 2010
creating a remote table

            select spider_direct_sql(
            "create table myspider.t1 (id int)",
            "",
            "port '6001', host '127.0.0.1', user
            'msandbox', password 'msandbox'");

            # result: 1




Tuesday, 13 April 2010
inserting records into
                             remote table
            select spider_direct_sql(
            "insert into myspider.t1 values (1000),
            (2000),(3000)",
            "",
            "port '6001', host '127.0.0.1', user
            'msandbox', password 'msandbox'");

            # result 1




Tuesday, 13 April 2010
getting remote records (1)

      create temporary table remote6001 (i int);

      select spider_direct_sql(
      "select * from myspider.t1",
      "remote6001",
      "port '6001', host '127.0.0.1', user
      'msandbox', password 'msandbox'");

      # result: 1



Tuesday, 13 April 2010
getting remote records (2)

      select * from remote6001;
      +------+
      | i    |
      +------+
      | 1000 |
      | 2000 |
      | 3000 |
      +------+




Tuesday, 13 April 2010
Vertical
                   partitioning
Tuesday, 13 April 2010
The vertical partition engine


                         Same author of the Spider engine

                         Open source https://launchpad.net/vpformysql

                         Simple concept




Tuesday, 13 April 2010
Original table

                                        employees

                         id   name    salary dept       email

                         1    Joe      1300   1     joe@company.com

                         2    Rick     1250   4     rick@other.com

                         3    Fred     1600   11     fred@some.org




Tuesday, 13 April 2010
Split tables
                                empl2                              empl1
             id             name        salary         id   dept           email
               1               Joe          1300        1     1     joe@company.com
               2            Rick            1250        2     4     rick@other.com
               3            Fred            1600        3    11      fred@some.org



                                      employees
     id              name        salary dept                email
       1                 Joe         1300          1   joe@company.com
       2                 Rick        1250          4   rick@other.com
       3                 Fred        1600      11       fred@some.org
Tuesday, 13 April 2010
Split tables
                                empl2                              empl1
             id             name        salary         id   dept           email
               1               Joe          1300        1     1     joe@company.com
               2            Rick            1250        2     4     rick@other.com
               3            Fred            1600        3    11      fred@some.org



                                      employees
     id              name        salary dept                email
       1                 Joe         1300          1   joe@company.com
                                                                             no data
       2                 Rick        1250          4   rick@other.com
                                                                               here
       3                 Fred        1600      11       fred@some.org
Tuesday, 13 April 2010
Vertical partitioning
                                 installation
                         Similar to Spider:

                            download the MySQL source code

                            download the engine source from
                            https://launchpad.net/vpformysql

                            copy source under $basedir/storage

                            apply patch

                            compile

                            load engine
Tuesday, 13 April 2010
vertical partitioning syntax
CREATE TABLE empl1 (
   id int not null,
   name varchar(50),
   salary decimal(10,3),
   primary key (id)
);

CREATE TABLE empl2 (
   id int not null,
   dept int,
   email varchar(100),
   primary key (id)
);
Tuesday, 13 April 2010
vertical partitioning syntax
CREATE TABLE employees (
  id int not null,
  name varchar(50),
  salary decimal(10,3),
  dept int,
  email varchar(100),
  primary key (id)
) engine=VP
COMMENT='table_name_list "empl1 empl2"';




Tuesday, 13 April 2010
inserting into the vertical
                            partitioning engine
insert into employees values (1, 'Joe',
1300,1,'joe@company.com');
Query OK, 1 row affected (0.00 sec)

insert into employees values (2, 'Rick',
1250,4,'rick@other.com');
Query OK, 1 row affected (0.00 sec)




Tuesday, 13 April 2010
retrieving data from the
                 vertical partitioning engine
select * from employees;
+----+------+----------+------+-----------------+
| id | name | salary   | dept | email           |
+----+------+----------+------+-----------------+
| 1 | Joe | 1300.000 |      1 | joe@company.com |
| 2 | Rick | 1250.000 |     4 | rick@other.com |
+----+------+----------+------+-----------------+




     The data is actually in empl1 and empl2 (check the
           data directory file sizes to make sure)

Tuesday, 13 April 2010
READING MORE
                         my blog (search for Spider)

                           http://datacharmer.blogspot.com

                         home of the engines

                           http://launchpad.net/spiderformysql

                           http://launchpad.net/vpformysql

                         Look for these slides:

                           http://slideshare.net/datacharmer

Tuesday, 13 April 2010
MORE SPIDER



                         ADVANCED SPIDER TECHNIQUES

                         tomorrow, same time




Tuesday, 13 April 2010
THANKS



                                                    Q&A
                         This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License.

Tuesday, 13 April 2010
Tuesday, 13 April 2010
Tuesday, 13 April 2010
Tuesday, 13 April 2010

More Related Content

Viewers also liked

The Search For Dark Energy - 4.26.2010 - Joe Beno
The Search For Dark Energy - 4.26.2010 - Joe BenoThe Search For Dark Energy - 4.26.2010 - Joe Beno
The Search For Dark Energy - 4.26.2010 - Joe Benocahouser
 
Arachnid tutorial
Arachnid tutorialArachnid tutorial
Arachnid tutorialkameshsept
 
Spider 1.0
Spider 1.0Spider 1.0
Spider 1.0vr1988
 
HexaDuino - Flying Hexapod
HexaDuino - Flying HexapodHexaDuino - Flying Hexapod
HexaDuino - Flying HexapodKenny Fernando
 
A Bionic gait programming algorithm for Hexapod Robot
A Bionic gait programming algorithm for Hexapod RobotA Bionic gait programming algorithm for Hexapod Robot
A Bionic gait programming algorithm for Hexapod RobotHao Yuan Cheng
 
Mechanical Spider - Robotics Course Details
Mechanical Spider  - Robotics Course DetailsMechanical Spider  - Robotics Course Details
Mechanical Spider - Robotics Course DetailsAcademy of Robotics
 
Use of Ilizarov fixator Hexapod apparatus for fracture prox tibia
Use of Ilizarov fixator Hexapod apparatus for fracture prox tibiaUse of Ilizarov fixator Hexapod apparatus for fracture prox tibia
Use of Ilizarov fixator Hexapod apparatus for fracture prox tibiamangalparihar
 
Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for DrupalChris Caple
 
Six Legged Walking Mechanism
Six Legged Walking MechanismSix Legged Walking Mechanism
Six Legged Walking Mechanismhassanmehdi97
 
Hexapod - Robot
Hexapod - RobotHexapod - Robot
Hexapod - Robotjojothish
 
Using Apache Solr
Using Apache SolrUsing Apache Solr
Using Apache Solrpittaya
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Robots presentation
Robots presentationRobots presentation
Robots presentationaroobkazim
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseAlexandre Rafalovitch
 

Viewers also liked (18)

The Search For Dark Energy - 4.26.2010 - Joe Beno
The Search For Dark Energy - 4.26.2010 - Joe BenoThe Search For Dark Energy - 4.26.2010 - Joe Beno
The Search For Dark Energy - 4.26.2010 - Joe Beno
 
Arachnid tutorial
Arachnid tutorialArachnid tutorial
Arachnid tutorial
 
Spider 1.0
Spider 1.0Spider 1.0
Spider 1.0
 
Hexapod Ecology
Hexapod EcologyHexapod Ecology
Hexapod Ecology
 
HexaDuino - Flying Hexapod
HexaDuino - Flying HexapodHexaDuino - Flying Hexapod
HexaDuino - Flying Hexapod
 
A Bionic gait programming algorithm for Hexapod Robot
A Bionic gait programming algorithm for Hexapod RobotA Bionic gait programming algorithm for Hexapod Robot
A Bionic gait programming algorithm for Hexapod Robot
 
Mechanical Spider - Robotics Course Details
Mechanical Spider  - Robotics Course DetailsMechanical Spider  - Robotics Course Details
Mechanical Spider - Robotics Course Details
 
Use of Ilizarov fixator Hexapod apparatus for fracture prox tibia
Use of Ilizarov fixator Hexapod apparatus for fracture prox tibiaUse of Ilizarov fixator Hexapod apparatus for fracture prox tibia
Use of Ilizarov fixator Hexapod apparatus for fracture prox tibia
 
Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for Drupal
 
Hexapod Presentation
Hexapod PresentationHexapod Presentation
Hexapod Presentation
 
Six Legged Walking Mechanism
Six Legged Walking MechanismSix Legged Walking Mechanism
Six Legged Walking Mechanism
 
Hexapod - Robot
Hexapod - RobotHexapod - Robot
Hexapod - Robot
 
Using Apache Solr
Using Apache SolrUsing Apache Solr
Using Apache Solr
 
Scaling Solr with Solr Cloud
Scaling Solr with Solr CloudScaling Solr with Solr Cloud
Scaling Solr with Solr Cloud
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash course
 
P1131131674
P1131131674P1131131674
P1131131674
 
Robots presentation
Robots presentationRobots presentation
Robots presentation
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by Case
 

More from Giuseppe Maxia

MySQL NDB 8.0 clusters in your laptop with dbdeployer
MySQL NDB 8.0 clusters in your laptop with dbdeployerMySQL NDB 8.0 clusters in your laptop with dbdeployer
MySQL NDB 8.0 clusters in your laptop with dbdeployerGiuseppe Maxia
 
Dbdeployer, the universal installer
Dbdeployer, the universal installerDbdeployer, the universal installer
Dbdeployer, the universal installerGiuseppe Maxia
 
Test complex database systems in your laptop with dbdeployer
Test complex database systems in your laptop with dbdeployerTest complex database systems in your laptop with dbdeployer
Test complex database systems in your laptop with dbdeployerGiuseppe Maxia
 
A quick tour of Mysql 8 roles
A quick tour of Mysql 8 rolesA quick tour of Mysql 8 roles
A quick tour of Mysql 8 rolesGiuseppe Maxia
 
Synchronise your data between MySQL and MongoDB
Synchronise your data between MySQL and MongoDBSynchronise your data between MySQL and MongoDB
Synchronise your data between MySQL and MongoDBGiuseppe Maxia
 
Juggle your data with Tungsten Replicator
Juggle your data with Tungsten ReplicatorJuggle your data with Tungsten Replicator
Juggle your data with Tungsten ReplicatorGiuseppe Maxia
 
Tungsten Replicator tutorial
Tungsten Replicator tutorialTungsten Replicator tutorial
Tungsten Replicator tutorialGiuseppe Maxia
 
Preventing multi master conflicts with tungsten
Preventing multi master conflicts with tungstenPreventing multi master conflicts with tungsten
Preventing multi master conflicts with tungstenGiuseppe Maxia
 
MySQL high availability power and usability
MySQL high availability power and usabilityMySQL high availability power and usability
MySQL high availability power and usabilityGiuseppe Maxia
 
Solving MySQL replication problems with Tungsten
Solving MySQL replication problems with TungstenSolving MySQL replication problems with Tungsten
Solving MySQL replication problems with TungstenGiuseppe Maxia
 
State of the art of MySQL replication and clustering
State of the art of MySQL replication and clusteringState of the art of MySQL replication and clustering
State of the art of MySQL replication and clusteringGiuseppe Maxia
 
Testing mysql creatively in a sandbox
Testing mysql creatively in a sandboxTesting mysql creatively in a sandbox
Testing mysql creatively in a sandboxGiuseppe Maxia
 
Mysql 5.5 and 5.6 replication
Mysql 5.5 and 5.6 replicationMysql 5.5 and 5.6 replication
Mysql 5.5 and 5.6 replicationGiuseppe Maxia
 

More from Giuseppe Maxia (20)

MySQL NDB 8.0 clusters in your laptop with dbdeployer
MySQL NDB 8.0 clusters in your laptop with dbdeployerMySQL NDB 8.0 clusters in your laptop with dbdeployer
MySQL NDB 8.0 clusters in your laptop with dbdeployer
 
Test like a_boss
Test like a_bossTest like a_boss
Test like a_boss
 
Dbdeployer, the universal installer
Dbdeployer, the universal installerDbdeployer, the universal installer
Dbdeployer, the universal installer
 
Test complex database systems in your laptop with dbdeployer
Test complex database systems in your laptop with dbdeployerTest complex database systems in your laptop with dbdeployer
Test complex database systems in your laptop with dbdeployer
 
Dbdeployer
DbdeployerDbdeployer
Dbdeployer
 
Dbdeployer
DbdeployerDbdeployer
Dbdeployer
 
A quick tour of Mysql 8 roles
A quick tour of Mysql 8 rolesA quick tour of Mysql 8 roles
A quick tour of Mysql 8 roles
 
MySQL document_store
MySQL document_storeMySQL document_store
MySQL document_store
 
Replication skeptic
Replication skepticReplication skeptic
Replication skeptic
 
Synchronise your data between MySQL and MongoDB
Synchronise your data between MySQL and MongoDBSynchronise your data between MySQL and MongoDB
Synchronise your data between MySQL and MongoDB
 
Juggle your data with Tungsten Replicator
Juggle your data with Tungsten ReplicatorJuggle your data with Tungsten Replicator
Juggle your data with Tungsten Replicator
 
MySQL in your laptop
MySQL in your laptopMySQL in your laptop
MySQL in your laptop
 
Script it
Script itScript it
Script it
 
Tungsten Replicator tutorial
Tungsten Replicator tutorialTungsten Replicator tutorial
Tungsten Replicator tutorial
 
Preventing multi master conflicts with tungsten
Preventing multi master conflicts with tungstenPreventing multi master conflicts with tungsten
Preventing multi master conflicts with tungsten
 
MySQL high availability power and usability
MySQL high availability power and usabilityMySQL high availability power and usability
MySQL high availability power and usability
 
Solving MySQL replication problems with Tungsten
Solving MySQL replication problems with TungstenSolving MySQL replication problems with Tungsten
Solving MySQL replication problems with Tungsten
 
State of the art of MySQL replication and clustering
State of the art of MySQL replication and clusteringState of the art of MySQL replication and clustering
State of the art of MySQL replication and clustering
 
Testing mysql creatively in a sandbox
Testing mysql creatively in a sandboxTesting mysql creatively in a sandbox
Testing mysql creatively in a sandbox
 
Mysql 5.5 and 5.6 replication
Mysql 5.5 and 5.6 replicationMysql 5.5 and 5.6 replication
Mysql 5.5 and 5.6 replication
 

Recently uploaded

20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdfJamie (Taka) Wang
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Spring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfSpring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfAnna Loughnan Colquhoun
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
Introduction to Quantum Computing
Introduction to Quantum ComputingIntroduction to Quantum Computing
Introduction to Quantum ComputingGDSC PJATK
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceMartin Humpolec
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 

Recently uploaded (20)

20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Spring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdfSpring24-Release Overview - Wellingtion User Group-1.pdf
Spring24-Release Overview - Wellingtion User Group-1.pdf
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
Introduction to Quantum Computing
Introduction to Quantum ComputingIntroduction to Quantum Computing
Introduction to Quantum Computing
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your Salesforce
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 

Sharding for the masses

  • 1. Sharding for the masses Introducing the Spider storage engine, and more Giuseppe Maxia Kentoku Shiba This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License. Tuesday, 13 April 2010
  • 2. What is sharding? "shard" is a piece of broken ceramic or glass "Sharding" means breaking a database to pieces Tuesday, 13 April 2010
  • 4. WHY SHARDING? SCALING SCALING SCALING SCALING SCALING Tuesday, 13 April 2010
  • 5. Scaling: the problem You start with one server Too much data Too much traffic Now what? Tuesday, 13 April 2010
  • 6. Scaling: the solution The MySQL way Also known as the Yahoo and Google way Replication Tuesday, 13 April 2010
  • 7. Replication: how it works Master WRITE client slave slave READ load balancer slave Tuesday, 13 April 2010
  • 8. Replication: how it scales Master WRITE client slave slave client slave slave slave slave slave slave client client client client READ load balancer slave slave slave slave Tuesday, 13 April 2010
  • 9. Replication: how it chokes Master > data WRITE client slave slave client slave slave slave slave slave slave client client client client READ load balancer slave slave slave slave Tuesday, 13 April 2010
  • 10. Homemade sharding client Master1 rule1 WRITE READ Master2 rule2 application logic MasterN MasterN MasterN MasterN ruleN ruleN ruleN ruleN Tuesday, 13 April 2010
  • 11. How homemade sharding breaks client Master1 rule1 WRITE READ Master2 rule2 application logic MasterN MasterN MasterN MasterN ruleN ruleN ruleN ruleN Tuesday, 13 April 2010
  • 12. How homemade sharding breaks client Master1 rule1 WRITE READ Master2 rule2 application logic MasterN MasterN MasterN MasterN ruleN ruleN ruleN ruleN Tuesday, 13 April 2010
  • 13. How homemade sharding breaks client WRITE READ application logic Master1 Master2 MasterN MasterN rule1 MasterN rule2 MasterN ruleN ruleN ruleN ruleN Tuesday, 13 April 2010
  • 14. How homemade sharding breaks (2) client Master1 rule1 rule1 rule1 rule1 rule1 WRITE READ Master2 rule2 rule2 application rule2 rule2 rule2 logic MasterN MasterN MasterN MasterN ruleN ruleN ruleN ruleN ruleNruleN ruleN ruleN Tuesday, 13 April 2010
  • 15. How homemade sharding breaks (2) client Master1 rule1 rule1 rule1 rule1 rule1 WRITE READ Master2 rule2 rule2 rule2 rule2 rule2 application logic MasterN MasterN MasterN MasterN ruleN ruleN ruleN ruleN ruleNruleN ruleN ruleN Tuesday, 13 April 2010
  • 16. The quest for magic sharding Tuesday, 13 April 2010
  • 17. The quest for magic sharding MySQL Proxy HSCALE SpockProxy Tuesday, 13 April 2010
  • 18. The quest for magic sharding MySQL Proxy HSCALE SpockProxy DON'T SCALE (SPoF) Tuesday, 13 April 2010
  • 19. Horizontal partitioning Tuesday, 13 April 2010
  • 20. Introducing Spider A MySQL storage engine Developed by Kentoku Shiba Built on top of the partitions engine Associates a partition with a remote server Transparent to user Easy to expand Independent from application Tuesday, 13 April 2010
  • 21. Note about partitions A feature introduced in MySQL 5.1 Horizontal partitioning Transparent to users Increases insertion and selection performance presentations http://tinyurl.com/mysql-partition-tut http://tinyurl.com/mysql-partition-perf Tuesday, 13 April 2010
  • 22. host2 MySQL server without SPIDER Spider conceptual model host3 MySQL server without SPIDER table employees host4 MySQL server partition 1 year 1997 host2 without SPIDER partition 2 year 1998 host3 partition 3 year 1999 host4 host5 MySQL server without partition 4 year 2000 host5 SPIDER partition 5 year 2001 host6 partition 6 year 2002 host7 host6 MySQL server without SPIDER host1 MySQL server with SPIDER host7 MySQL server without SPIDER Tuesday, 13 April 2010
  • 23. host2 MySQL server without SPIDER Spider conceptual model host3 MySQL server without SPIDER table employees host4 MySQL server partition 1 year 1997 host2 without SPIDER partition 2 year 1998 host3 partition 3 year 1999 host4 host5 MySQL server without partition 4 year 2000 host5 SPIDER partition 5 year 2001 host6 partition 6 year 2002 host7 host6 MySQL server without SPIDER host1 MySQL server with SPIDER host7 MySQL server no without SPIDER data here Tuesday, 13 April 2010
  • 24. select * from employees where date = '1998-01-01' host3 MySQL server without SPIDER table employees host4 MySQL server partition 1 year 1997 host2 without SPIDER partition 2 year 1998 host3 partition 3 year 1999 host4 host5 MySQL server without partition 4 year 2000 host5 SPIDER partition 5 year 2001 host6 partition 6 year 2002 host7 host6 MySQL server without SPIDER host1 MySQL server with SPIDER host7 MySQL server without SPIDER Tuesday, 13 April 2010
  • 25. host2 MySQL server without select * from SPIDER employees where date = '1998-01-01' host3 MySQL server without SPIDER table employees host4 MySQL server partition 1 year 1997 host2 without SPIDER partition 2 year 1998 host3 partition 3 year 1999 host4 host5 MySQL server without partition 4 year 2000 host5 SPIDER partition 5 year 2001 host6 partition 6 year 2002 host7 host6 MySQL server without SPIDER host1 MySQL server with SPIDER host7 MySQL server without SPIDER Tuesday, 13 April 2010
  • 26. host2 MySQL server without SPIDER host3 MySQL server without SPIDER table employees host4 MySQL server partition 1 year 1997 host2 without SPIDER partition 2 year 1998 host3 partition 3 year 1999 host4 host5 MySQL server without partition 4 year 2000 host5 SPIDER partition 5 year 2001 host6 partition 6 year 2002 host7 host6 MySQL server without SPIDER host1 MySQL server with SPIDER host7 MySQL server without SPIDER Tuesday, 13 April 2010
  • 27. select * from host2 MySQL server without employees where date = SPIDER '1998-01-01' limit 0, 9223372036854775807 host3 MySQL server without SPIDER table employees host4 MySQL server partition 1 year 1997 host2 without SPIDER partition 2 year 1998 host3 partition 3 year 1999 host4 host5 MySQL server without partition 4 year 2000 host5 SPIDER partition 5 year 2001 host6 partition 6 year 2002 host7 host6 MySQL server without SPIDER host1 MySQL server with SPIDER host7 MySQL server without SPIDER Tuesday, 13 April 2010
  • 28. host2 MySQL server without SPIDER host3 MySQL server without SPIDER table employees host4 MySQL server partition 1 year 1997 host2 without SPIDER partition 2 year 1998 host3 partition 3 year 1999 host4 host5 MySQL server without partition 4 year 2000 host5 SPIDER partition 5 year 2001 host6 partition 6 year 2002 host7 host6 MySQL server without SPIDER host1 MySQL server with SPIDER host7 MySQL server without SPIDER Tuesday, 13 April 2010
  • 29. host8 MySQL server with SPIDER host2 MySQL server without SPIDER table employees partition 1 year 1997 host2 host3 MySQL server partition 2 year 1998 host3 without SPIDER partition 3 year 1999 host4 partition 4 year 2000 host5 host4 MySQL server partition 5 year 2001 host6 without SPIDER partition 6 year 2002 host7 host5 MySQL server host1 MySQL server with SPIDER without SPIDER table employees host6 MySQL server partition 1 year 1997 host2 without SPIDER partition 2 year 1998 host3 partition 3 year 1999 host4 host7 MySQL server partition 4 year 2000 host5 without SPIDER partition 5 year 2001 host6 partition 6 year 2002 host7 Tuesday, 13 April 2010
  • 30. data data data data data Tuesday, 13 April 2010
  • 31. data data data data data Tuesday, 13 April 2010
  • 32. data data data data data Tuesday, 13 April 2010
  • 33. data data data data data Tuesday, 13 April 2010
  • 34. data data data data data Tuesday, 13 April 2010
  • 35. data data data data data client Tuesday, 13 April 2010
  • 36. data data data data data client client client client Tuesday, 13 April 2010
  • 37. data data data data data HA HA HA HA HA data data data data data Tuesday, 13 April 2010
  • 39. INSTALLATION (1) Get the source code for MySQL 5.1.39 http://dev.mysql.com/downloads Get the source code for Spider 2.5 http://launchpad.net/spiderformysql Get the patch for condition pushdown https://launchpad.net/ partitionconditionpushdownformysql Tuesday, 13 April 2010
  • 40. INSTALLATION (2) Unpack MySQL source code Unpack Spider source code and docs Unpack the condition pushdown patch mkdir spider cd spider tar -xzf mysql-5.1.39.tar.gz tar -xzf spider-src-1.0-for-5.1.39.tgz tar -xzf spider-doc-1.0-for-5.1.39.tgz tar -xzf partition_cond_push-0.1- for-5.1.36.tgz Tuesday, 13 April 2010
  • 41. / l .' : __.....__..._ ____ / / _.-" "-. "" "-. (`-: .---: .--.' _....J. "-. """y ,.' __..--"" `+""--. `. : .'/ .-"""-. _. `. "-. `._.._ ; _.'.' .-j `. "-. "-._`. : / .-" : `-. `- "-. ; /.' ; :; ." `, :_:/ :: ;: ( / .-" .') ; STOP! ;-" ; "-. / ; .^. .' .' / .-" / .- : `. '. : .- / __.-j.'.' .-" /.---' / / `,. .' "":' /-" .' __.' : : ,"" ; .' .' .-"" _J ; ; `. /.' _/ .-" / "-: /"--.b-..-' .' ; / / ""-..' .--'.-'/ , : :`. : / : `-i" ,',_: _ : '._ :__; .'.-"; ; ; j `.l "-._ `" :_/ :_/ `.; "-._ What the hell is a :_"-._ "-. `. l "-. ) `. ""^--""^-. : "; : `._ "condition ; / `._ ""---. / / _ `.--.__.' : : / ; :". ; ; : : ; `. `. / ; : ; : `. `. / /: ; : ; "-' pushdown"? :_.' ; ; ; : / / :_l `-' Tuesday, 13 April 2010
  • 42. MASTER remote SELECT * FROM sometable server WHERE col1 = 2 management SELECT col1,col2,col3 FROM sometable LIMIT 0,9999999 Without condition REMOTE SERVER pushdown Tuesday, 13 April 2010
  • 43. MASTER remote SELECT * FROM sometable server WHERE col1 = 2 management SELECT col1,col2,col3 FROM sometable WHERE col1 = 2 With LIMIT 0,9999999 condition pushdown REMOTE SERVER Tuesday, 13 April 2010
  • 44. INSTALLATION (3) Move the spider directory into MySQL source code mv spider mysql-5.1.39/storage Tuesday, 13 April 2010
  • 45. INSTALLATION (4) Apply the Spider patches to MySQL code cd mysql-5.1.39 patch -p2 < ../mysql-5.1.39.spider.diff patch -p2 < ../mysql-5.1.36.partition_cond_push.diff Tuesday, 13 April 2010
  • 46. INSTALLATION (5) Compile MySQL code (see the docs for details) autoconf automake ./configure --enable-thread-safe-client --enable-local-infile --with-pic --with-fast-mutexes --with-client-ldflags=-static --with-mysqld-ldflags=-static --with-zlib-dir=bundled --with-big-tables --with-ssl --with-readline --with-embedded-server --with-partition --with-innodb --without-ndbcluster --without-archive-storage-engine --without-blackhole-storage-engine --with-csv-storage-engine --without-example-storage-engine --without-federated-storage-engine --with-extra-charsets=complex && make Tuesday, 13 April 2010
  • 47. INSTALLATION (6) create a binary tarball ./scripts/make_binary_distribution Tuesday, 13 April 2010
  • 48. INSTALLATION (7) Install manually in your main server OR use MySQL Sandbox make_sandbox $PWD/mysql-5.1.39-osx10.5-i386.tar.gz --sandbox_directory=spider_main Tuesday, 13 April 2010
  • 50. SETUP (1) Get the SQL from the docs or get it from my site http://datacharmer.org/downloads/spider_setup.sql Run it cd $HOME/sandboxes/spider_main wget http://datacharmer.org/downloads/ spider_setup.sql ./use < spider_setup.sql Tuesday, 13 April 2010
  • 51. SETUP (2) Check the engines ./use select engine,support,transactions,xa -> from information_schema.engines; +------------+---------+--------------+------+ | engine | support | transactions | xa | +------------+---------+--------------+------+ | SPIDER | YES | YES | YES | | MRG_MYISAM | YES | NO | NO | | CSV | YES | NO | NO | | MyISAM | DEFAULT | NO | NO | | InnoDB | YES | YES | YES | | MEMORY | YES | NO | NO | +------------+---------+--------------+------+ Tuesday, 13 April 2010
  • 52. USING SPIDER (Simple case) Tuesday, 13 April 2010
  • 53. Preparing remote servers (1) Create three servers using MySQL Sandbox make_multiple_sandbox --group_directory=spider_dir --sandbox_base_port=6000 --check_base_port 5.1.39 Tuesday, 13 April 2010
  • 54. Preparing remote servers (2) Check the port numbers ~/sandboxes/spider_dir/use_all "show variables like 'port'" # server: 1: Variable_name Value port 6001 # server: 2: Variable_name Value port 6002 # server: 3: Variable_name Value port 6003 Tuesday, 13 April 2010
  • 55. Preparing remote servers (3) create table definition (tablea.sql) drop schema if exists myspider; create schema myspider; use myspider; Create table tbl_a( col_a int, col_b int, primary key(col_a) ); Tuesday, 13 April 2010
  • 56. Preparing remote servers (4) create table in remote servers cd $HOME/sandboxes/spider_dir ./use_all "source tablea.sql" Tuesday, 13 April 2010
  • 57. setting the main server (1) create table definition (tablea_main.sql) drop schema if exists myspider; create schema myspider; use myspider; Create table tbl_b( col_a int, col_b int, primary key(col_a) ) engine = Spider -- continues ... Tuesday, 13 April 2010
  • 58. setting the main server (2) create table definition (tablea_main.sql) (continues) Connection ' table "tbl_a", user "msandbox", password "msandbox" ' partition by range( col_a ) ( partition pt1 values less than (1000) comment 'host "127.0.0.1", port "6001"', partition pt2 values less than (2000) comment 'host "127.0.0.1", port "6002"', partition pt3 values less than (MAXVALUE) comment 'host "127.0.0.1", port "6003"' ); Tuesday, 13 April 2010
  • 59. setting the main server (3) create table ./use < tablea_main.sql Tuesday, 13 April 2010
  • 60. (Finally) using it (1) in the main server ./use myspider insert into tbl_b values (500,1), (1500,2), (5000,3); Query OK, 3 rows affected (0.01 sec) Records: 3 Duplicates: 0 Warnings: 0 Tuesday, 13 April 2010
  • 61. (Finally) using it (2) in the main server select * from tbl_b; +-------+-------+ | col_a | col_b | +-------+-------+ | 500 | 1 | | 1500 | 2 | | 5000 | 3 | +-------+-------+ 3 rows in set (0.01 sec) Tuesday, 13 April 2010
  • 62. WHERE IS THE DATA? 500 < 1000 = host 1 1500 < 2000 = host2 5000 < MAXVALUE = host3 Tuesday, 13 April 2010
  • 63. Looking for the data in the "remote" servers $HOME/sandboxes/spider_dir/use_all "select * from myspider.tbl_a" # server: 1: col_a col_b 500 1 # server: 2: col_a col_b 1500 2 # server: 3: col_a col_b 5000 3 Tuesday, 13 April 2010
  • 64. Using Spider (more complex case) Tuesday, 13 April 2010
  • 65. Setting more remote servers (1) in the main server ./use myspider drop table tbl_b; Tuesday, 13 April 2010
  • 66. Setting more remote servers (2) create 20 remote servers make_multiple_sandbox --how_many_nodes=20 --group_directory=spider_dir --sandbox_base_port=6000 5.1.39 Tuesday, 13 April 2010
  • 67. Setting more remote servers (3) create tables for the employees database cd $HOME/sandboxes/spider_dir wget http://datacharmer.org/downloads/ spider_remote_employees.sql ./use_all "source spider_remote_employees.sql" # see also http://launchpad.net/test-db Tuesday, 13 April 2010
  • 68. the test employees database http://launchpad.net/test-db Tuesday, 13 April 2010
  • 69. Setting the main server (1) create tables for the employees database cd $HOME/sandboxes/spider_main wget http://datacharmer.org/downloads/ spider_main_employees.sql ./use < spider_main_employees.sql # see also http://launchpad.net/test-db Tuesday, 13 April 2010
  • 70. checking the remote servers see how many rows have you got after loading cd $HOME/sandboxes/spider_dir ./use_all "select count(*) from employees.salaries" # server: 1: count(*) 0 # server: 2: count(*) 18293 # server: 3: count(*) 37957 # server: 4: count(*) 57440 ... Tuesday, 13 April 2010
  • 72. Spider engine performance Comparable to the gains offered by partitioning (from 30 to 1000% depending on query type) Load easily split across masters Tuesday, 13 April 2010
  • 73. Running remote commands with Spider Tuesday, 13 April 2010
  • 74. data data data data data UDF CREATE TABLE db.t1(i int) "" host "127.0.0.1", port "6001", user "msandbox" 1 = success 0 = failure Tuesday, 13 April 2010
  • 75. data data data data data UDF INSERT INTO db.t1 VALUES (1000), (2000), (3000) "" host "127.0.0.1", port "6001", user "msandbox" 1 = success 0 = failure Tuesday, 13 April 2010
  • 76. data data data data data CREATE TEMPORARY TABLE my_temp_table (i int) my_temp_table Tuesday, 13 April 2010
  • 77. data data data data data UDF SELECT * FROM db.t1 "my_temp_table" host "127.0.0.1", port "6001", user "msandbox" my_temp_table 1000 2000 3000 Tuesday, 13 April 2010
  • 78. creating a remote table select spider_direct_sql( "create table myspider.t1 (id int)", "", "port '6001', host '127.0.0.1', user 'msandbox', password 'msandbox'"); # result: 1 Tuesday, 13 April 2010
  • 79. inserting records into remote table select spider_direct_sql( "insert into myspider.t1 values (1000), (2000),(3000)", "", "port '6001', host '127.0.0.1', user 'msandbox', password 'msandbox'"); # result 1 Tuesday, 13 April 2010
  • 80. getting remote records (1) create temporary table remote6001 (i int); select spider_direct_sql( "select * from myspider.t1", "remote6001", "port '6001', host '127.0.0.1', user 'msandbox', password 'msandbox'"); # result: 1 Tuesday, 13 April 2010
  • 81. getting remote records (2) select * from remote6001; +------+ | i | +------+ | 1000 | | 2000 | | 3000 | +------+ Tuesday, 13 April 2010
  • 82. Vertical partitioning Tuesday, 13 April 2010
  • 83. The vertical partition engine Same author of the Spider engine Open source https://launchpad.net/vpformysql Simple concept Tuesday, 13 April 2010
  • 84. Original table employees id name salary dept email 1 Joe 1300 1 joe@company.com 2 Rick 1250 4 rick@other.com 3 Fred 1600 11 fred@some.org Tuesday, 13 April 2010
  • 85. Split tables empl2 empl1 id name salary id dept email 1 Joe 1300 1 1 joe@company.com 2 Rick 1250 2 4 rick@other.com 3 Fred 1600 3 11 fred@some.org employees id name salary dept email 1 Joe 1300 1 joe@company.com 2 Rick 1250 4 rick@other.com 3 Fred 1600 11 fred@some.org Tuesday, 13 April 2010
  • 86. Split tables empl2 empl1 id name salary id dept email 1 Joe 1300 1 1 joe@company.com 2 Rick 1250 2 4 rick@other.com 3 Fred 1600 3 11 fred@some.org employees id name salary dept email 1 Joe 1300 1 joe@company.com no data 2 Rick 1250 4 rick@other.com here 3 Fred 1600 11 fred@some.org Tuesday, 13 April 2010
  • 87. Vertical partitioning installation Similar to Spider: download the MySQL source code download the engine source from https://launchpad.net/vpformysql copy source under $basedir/storage apply patch compile load engine Tuesday, 13 April 2010
  • 88. vertical partitioning syntax CREATE TABLE empl1 ( id int not null, name varchar(50), salary decimal(10,3), primary key (id) ); CREATE TABLE empl2 ( id int not null, dept int, email varchar(100), primary key (id) ); Tuesday, 13 April 2010
  • 89. vertical partitioning syntax CREATE TABLE employees ( id int not null, name varchar(50), salary decimal(10,3), dept int, email varchar(100), primary key (id) ) engine=VP COMMENT='table_name_list "empl1 empl2"'; Tuesday, 13 April 2010
  • 90. inserting into the vertical partitioning engine insert into employees values (1, 'Joe', 1300,1,'joe@company.com'); Query OK, 1 row affected (0.00 sec) insert into employees values (2, 'Rick', 1250,4,'rick@other.com'); Query OK, 1 row affected (0.00 sec) Tuesday, 13 April 2010
  • 91. retrieving data from the vertical partitioning engine select * from employees; +----+------+----------+------+-----------------+ | id | name | salary | dept | email | +----+------+----------+------+-----------------+ | 1 | Joe | 1300.000 | 1 | joe@company.com | | 2 | Rick | 1250.000 | 4 | rick@other.com | +----+------+----------+------+-----------------+ The data is actually in empl1 and empl2 (check the data directory file sizes to make sure) Tuesday, 13 April 2010
  • 92. READING MORE my blog (search for Spider) http://datacharmer.blogspot.com home of the engines http://launchpad.net/spiderformysql http://launchpad.net/vpformysql Look for these slides: http://slideshare.net/datacharmer Tuesday, 13 April 2010
  • 93. MORE SPIDER ADVANCED SPIDER TECHNIQUES tomorrow, same time Tuesday, 13 April 2010
  • 94. THANKS Q&A This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License. Tuesday, 13 April 2010