Saturday, November 8, 2014

Open source India Conference: MySQL High Availability with Replication New Features

The session was presented at Open Source India 2014 (http://osidays.com/osidays/) by Shivji (me) and Manish Kumar. It covers the new features in MySQL-5.7 Replication, including work on:
  1. Performance enhancements in MySQL Replication
  2. Usability improvements
  3. More flexibility, giving our users more options so they can choose what is best for their application
  4. Semi-synchronous replication and MySQL Group Replication
  5. Semi-synchronous and MySQL Group Replication 
At the end, there are a number of links to the blogs written on these features by the MySQL Replication engineers.

Thursday, October 16, 2014

MySQL 5.7.5- More variables in replication performance_schema tables

At MySQL, replication usability is of utmost importance to us. Replication information has long been exposed through SHOW commands, with SHOW SLAVE STATUS occupying a major chunk of it, alongside other sources such as replication-related server and status variables.
As the replication module grows further, there is a lot more monitoring information, so much that the present interfaces seem too rigid to accommodate all the information we would like to present. So we need to organize it in a more structured manner. In MySQL-5.7.2, we introduced replication performance_schema (P_S) tables, providing an SQL interface to monitor replication configuration and status, partitioned into different tables, each grouping logically related information. You can read more about the contents of these tables in the official MySQL documentation.
In MySQL-5.7.5 we added some more status variables to these performance_schema tables, to enable monitoring of the latest replication features, namely multi-source replication (available in MySQL Labs). Multi-source replication allows a MySQL slave to replicate from multiple sources (masters) directly, so replication information is needed per source. We therefore added the global status variables that are useful on a per-source basis to the replication performance_schema tables, to help monitor multi-source replication. Note that these variables still work for single-source replication and can still be accessed as:
SHOW STATUS LIKE 'Slave_running';
SHOW STATUS LIKE 'Slave_retried_transactions';
SHOW STATUS LIKE 'Slave_last_heartbeat';
SHOW STATUS LIKE 'Slave_received_heartbeats';
SHOW STATUS LIKE 'Slave_heartbeat_period';
Note, though, that these status variables are now useful in single-source mode ONLY. If more sources are added, the status variables still apply only to the first source. For the other replication sources (masters), the only way to access this information is through the replication performance_schema tables named below. Here is how the server variable names map to the names in the replication performance_schema tables:
VARIABLE NAME               P_S TABLE NAME                        P_S FIELD NAME
SLAVE_HEARTBEAT_PERIOD      replication_connection_configuration  HEARTBEAT_INTERVAL
SLAVE_RECEIVED_HEARTBEATS   replication_connection_status         COUNT_RECEIVED_HEARTBEATS
SLAVE_LAST_HEARTBEAT        replication_connection_status         LAST_HEARTBEAT_TIMESTAMP
SLAVE_RETRIED_TRANSACTIONS  replication_execute_status            COUNT_TRANSACTIONS_RETRIES
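For instance, in the default single-source setup, the same data can be read from the tables above with queries like these (a small sketch; the field names are taken from the mapping):

SELECT HEARTBEAT_INTERVAL
  FROM performance_schema.replication_connection_configuration;

SELECT COUNT_RECEIVED_HEARTBEATS, LAST_HEARTBEAT_TIMESTAMP
  FROM performance_schema.replication_connection_status;

SELECT COUNT_TRANSACTIONS_RETRIES
  FROM performance_schema.replication_execute_status;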
The variable 'slave_running' reports whether the slave is running or not. The same can now be determined by inspecting the two replication components (receiver and applier) separately. To check whether the receiver module is running, execute:
SELECT SERVICE_STATE FROM performance_schema.replication_connection_status;
and to check whether the execute (applier) module is running, execute:
SELECT SERVICE_STATE FROM performance_schema.replication_execute_status;
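If you want both states in a single result, the two tables can also be combined; a minimal sketch (both tables have a single row in single-source mode):

SELECT c.SERVICE_STATE AS receiver_state, e.SERVICE_STATE AS applier_state
  FROM performance_schema.replication_connection_status c,
       performance_schema.replication_execute_status e;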
Please try out multi-source replication and our new monitoring interface in the form of replication performance_schema tables. As always, your feedback is very valuable to us.

Tuesday, April 8, 2014

MySQL Developers Conference: MySQL Replication and Scalability

The slide deck contains the latest developments in MySQL Replication. It covers:
- An introduction to MySQL Replication
- Scaling with Multi-threaded slaves
- Data aggregation with Multi-source replication
- Lossless failover with semi-synchronous replication
- Replication Monitoring made easier

Tuesday, April 1, 2014

MySQL-5.7.4- Change master without stopping slave altogether

At MySQL, we have been working on simplifying the failover process, making it faster, more flexible and easier to use. MySQL-5.6 came up with Global Transaction Identifiers (GTIDs), which were a huge leap towards easing the failover process by hiding the details of replication logs and positions. With MySQL-5.7.4, we are introducing a new feature that adds further flexibility and online operation: the user only needs to shut down the components that are being re-configured.

What we allow with this new feature is executing the CHANGE MASTER TO command without stopping the slave altogether. We realized that stopping the slave altogether is not mandatory in all cases; doing so was more of a cautious approach to switching masters that, at times, restricted more than what was required.

Let's dive deeper into this to see what can be relaxed here. For this, let's break the replication process into two modules:
M1) Receiver module (the IO thread) and
M2) Applier module (the SQL thread, or the coordinator and worker threads, whichever the case may be)

We can now divide options under the command CHANGE MASTER TO into three groups based on the above classification: 

G1) Options that change a receiver configuration. 
G2) Options that change an applier configuration. 
G3) Options that relate to both (1) and (2). 

For the precise division, look at the picture below. Note that the illustration takes into account all the CHANGE MASTER TO options present currently (MySQL-5.7.4).




Note that, given its current usage, we could have put the MASTER_AUTO_POSITION option under group G1 (i.e., the receive side). Currently only the receiver module uses GTID positioning, but we foresee that in the future it will be good to allow the applier module to use GTID positioning as well. We therefore keep the MASTER_AUTO_POSITION option under group G3 to keep things future-proof. Worried that this obstructs the failover process again, like before? That is not a problem, because MASTER_AUTO_POSITION is a slave configuration that you set only once. It then applies every time you redirect the slave to a new immediate master, so you don't need to specify it on failover (a short sketch follows the rules below).

With the classifications stated above, we propose a 3-point rule stated as:
R1) For CHANGE MASTER TO options under group G1, stop only the receiver module (M1) using STOP SLAVE IO_THREAD.
R2) For CHANGE MASTER TO options under group G2, stop only the applier module (M2) using STOP SLAVE SQL_THREAD.
R3) For CHANGE MASTER TO options under group G3, stop both the receiver (M1) and
applier (M2) modules using STOP SLAVE.

HOW DOES THIS RULE WORK?
Let's explore our 3-point rule (R1-R3) in more detail:

  • Starting with rule R1, we stated that we only need to stop the receiver thread to change receive-side options. What happens to the applier module? The applier module keeps applying pending transactions, if any. So if your slave was lagging behind with a lot of transactions queued in its relay logs, you can let the applier catch up while you switch masters or change a receive-side configuration for the same master.
  • Under rule R2, we stated that we can change the applier configuration after stopping the applier threads ONLY; the receiver module can keep running while you do this. So while you fine-tune your slave's applier module, the receiver keeps reading the master's log and copying transactions to the slave's relay log. These can then be applied, in parallel if configured, once the applier module is up and running again.
  • Under rule R3, you stop both receiver and applier modules. So, this is analogous to
    STOP SLAVE;
    CHANGE MASTER TO <master_def>;
    used before this feature was available.
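Putting rules R1 and R3 together with the earlier note on MASTER_AUTO_POSITION, here is a minimal sketch of that workflow; the host and port below are placeholders and GTID mode is assumed to be enabled:

-- done once; MASTER_AUTO_POSITION is in group G3, so both threads are stopped
STOP SLAVE;
CHANGE MASTER TO MASTER_AUTO_POSITION= 1;
START SLAVE;

-- a later fail-over changes only receiver options (group G1), so rule R1 applies
STOP SLAVE IO_THREAD;
CHANGE MASTER TO MASTER_HOST= 'new_master.example.com', MASTER_PORT= 3306;
START SLAVE IO_THREAD;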

Worried about how relay log purging is handled now? It is pretty simple: under rules R1 and R2, we do not purge relay logs implicitly on executing the CHANGE MASTER TO command, so whichever of the receiver or applier is still running simply keeps adding to or processing the relay logs, just as it would if no replication thread had been stopped.

Finally, note that you do not always need to look at the figure above to find which options are allowed with which thread stopped. You just need to ask yourself whether the parameter relates to the receiver or the applier and stop the corresponding thread. And if you go wrong, there are error messages to guide you onto the right path. Look at the next section for the usage and the errors.

EXAMPLES OF USAGE
example 1:
Previously, to change master heartbeat period, you would do a

STOP SLAVE;
CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD= <value>;
START SLAVE;

Now, with this feature, you just have to stop the receiver (IO) thread, as the heartbeat has nothing to do with the applier thread(s).
 
STOP SLAVE IO_THREAD;
CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD= <value>;
START SLAVE IO_THREAD;

Note that the applier thread keeps executing the transactions in the relay log while you change the heartbeat period for the master-slave connection. Likewise, you could do this with all the attributes mentioned in group G1 in the figure above.

example 2:
Similarly, to change applier thread attributes, you just have to stop the applier threads.
So instead of

STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY=<value>;
START SLAVE;

it is enough to do the following with this feature.

STOP SLAVE SQL_THREAD;
CHANGE MASTER TO MASTER_DELAY=<value>;
START SLAVE SQL_THREAD;

Lastly, if you go wrong, there are clearly worded error messages to guide you. So, in the first case, if your receiver module is active and you execute

CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD= <value>;

 you get an error saying:
This operation cannot be performed with a running slave io thread; run STOP SLAVE IO_THREAD first.


and if you forgot to stop the applier module when it was required, the server will say:

This operation cannot be performed with a running slave sql thread; run STOP SLAVE SQL_THREAD first.


Lastly, the error message saying that both threads must be stopped still exists, but it now appears only for the MASTER_AUTO_POSITION option:
This operation cannot be performed with a running slave; run STOP SLAVE first.

Lets see some examples once again:

example 3:
slave>START SLAVE;
slave>CHANGE MASTER TO MASTER_DELAY= 10;
ERROR 1900 (HY000): This operation cannot be performed with a running slave sql thread; run STOP SLAVE SQL_THREAD first

example 4:
mysql> CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD= 10;
ERROR 1904 (HY000): This operation cannot be performed with a running slave io thread; run STOP SLAVE IO_THREAD first.

example 5:
mysql> CHANGE MASTER TO MASTER_AUTO_POSITION= 0;
ERROR 1198 (HY000): This operation cannot be performed with a running slave; run STOP SLAVE first

SIDE-EFFECTS?

While implementing this, we have taken special care to make sure we don't break anything for a user who switches masters the old way:

STOP SLAVE;
CHANGE MASTER to <master_def>;
START SLAVE.

There are absolutely NO side-effects to worry about. Note that, as stated before, CHANGE MASTER TO will not delete relay logs if either the receiver or the applier thread is running.

Try it out and give us your feedback. As always, we look forward to hearing from you to improve this feature. Enjoy :)

Friday, March 7, 2014

MySQL User Camp: Global Transaction Identifiers

The slides contain an introduction to Global Transaction Identifiers (GTIDs) in MySQL Replication. The new reconnection protocol, skipping transactions with GTIDs, replication filters, purging logs, backup/restore etc. are covered here.

Wednesday, December 18, 2013

CAP theorem and NoSQL databases

I was talking to a friend yesterday who said "RDBMS is going to go away, everyone uses NoSQL these days". This served as the motivation behind writing this post. No, I don't think that is the case by any stretch of imagination. It did, however, get me reading more about NoSQL databases. Let's travel down this path to understand why NoSQL databases are so popular today and how they started.

To get started on this, lets first try to understand the CAP theorem. There are three ingredients in the CAP theorem namely:

  1. Consistency- Having the same data across all the nodes in the cluster at any given instant of time.

  2. Availability- Being able to serve always. No downtime and least possible response time.

  3. Partition Tolerance- The system continues to serve even if some link is  broken and your cluster is broken into two or more parts. There could be a loss of a message, some node may crash, but you still want to be able to serve. 


Now, the CAP theorem states that you can carry home only two out of these three. This is where the difference between RDBMS and NoSQL lies! Let's look at the three combinations we can form here [2]:

  1. CA - data is consistent between all nodes - as long as all nodes are online - and you can read/write from any node and be sure that the data is the same.

  2. CP - data is consistent between all nodes, and maintains partition tolerance by becoming unavailable when a node goes down.

  3. AP - nodes remain online even if they can't communicate with each other and will resync data once the partition is resolved, but you aren't guaranteed that all nodes will have the same data (either during or after the partition)


Now let's look at some popular NoSQL adopters and then come back to see why NoSQL is good and applicable to them, while RDBMS, in my opinion, will continue to co-exist.
Let's talk about amazon.com first. Their business model is such that they want to be available all the time. They wouldn't want their site to be down or to have a higher response time at any moment. So it is very essential for them to have the 'A' and 'P' attributes of the CAP theorem. They would rather give away the 'C', to an extent. Getting a regret from amazon.com saying "we don't have this item although we showed you it was available earlier" is not as bad as the site itself going down. So if there were one item left and two people simultaneously put it into their carts, that could happen, but given their business model they have alternatives to save their customers from this situation. For instance, they could always keep some extra items in stock.

Similarly, think of facebook.com. Suppose you post a picture on your wall. It is not a big deal if one of your friends can see that picture immediately and another can only see it a few moments later. Again, Facebook does not care as much about consistency as it does about availability.

Let's now think about why a cluster or a farm of servers was needed at all. It is because everything you do on the internet is stored in a database. Google, Facebook, Amazon etc. are examples of companies that keep all this data for providing personalized search, recommendations and so on. This huge amount of data, on the order of petabytes or zettabytes, cannot be stored on one disk. Trying to store all of it on one disk and replicate it to more such disks is a pain, and that is why Google chose to use a farm of several servers with smaller disks. A traditional RDBMS was built to serve best from a single disk, and that is why people with this huge data came up with BigTable, DynamoDB etc.

And as we near the end of this article, it is important to have a look at some NoSQL databases. There are many out there, which can be broadly divided into 4 categories:

  1. Column: HBase, Accumulo

  2. Document: MarkLogic, MongoDB, Couchbase

  3. Key-value: Dynamo, Riak, Redis, Cache, Project Voldemort

  4. Graph: Neo4J, Allegro, Virtuoso


Note that there isn't a concrete line between the 4 types. As an example, the document-oriented databases and the key-value databases can resemble each other to some extent at times, so the boundaries are a little fuzzy. To conclude, I would say NoSQL databases are popular and are good in certain circumstances, but when you come to something like banking you really need ACID compliance and therefore an RDBMS. So, in my opinion, they will co-exist as they do today.

References:

  1. http://en.wikipedia.org/wiki/CAP_theorem

  2. http://stackoverflow.com/questions/12346326/nosql-cap-theorem-availability-and-partition-tolerance

  3. http://stackoverflow.com/questions/16779348/does-the-cap-theorem-imply-that-acid-is-not-possible-for-distributed-databases

  4. http://en.wikipedia.org/wiki/NoSQL

  5. https://www.youtube.com/watch?v=qI_g07C_Q5I

Tuesday, December 10, 2013

A sneak peek into the Hadoop ecosystem

I jumped into the IT field for the love I had for it, and the biggest distraction for me has always been trying to know something I know nothing about. Ever since some of my colleagues started working with Hadoop, I have been wanting to read about it. Along with that, the continuous drive has been a feeling that there has to be something really nice about big data, for everyone talking about it finds themselves so cool. And finally, there was the latest post on Facebook by a professor saying:
Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it...

Well, this quote is pretty famous by now, and I must acknowledge it was something that pushed me into studying more about what this actually is and why it is so cool! Hopefully the next time I bump into some cool people I will have something to talk about as well :D. I finally found some time and energy to study some of it this weekend. Here is a high-level overview of the picture[1] I have in my mind right now:

[Figure: the Hadoop ecosystem]

HDFS

The base of this ecosystem is the Hadoop Distributed File System, derived from Google's whitepaper for the Google File System (GFS). Let's take a simple example[2] to understand HDFS. Say we have a record containing the phone numbers of all the people in a city, and we use a 5-machine Hadoop cluster to keep this data. Say we want a replication factor of 3, which means every chunk of data will have 2 extra backup copies stored on different machines. Further, let's assume that we divide the hard disks on our 5 machines into 78 pieces. We store the phone numbers of all the people whose names start with 'A' on one piece of a disk and keep its backups on two other machines, and similarly for all the names starting with 'B'-'Z'. In this way the data is organized so that each piece lives on 3 of the machines.

MapReduce: To generate a report from all the data, you would now need MapReduce code. The MapReduce API is available as open source. But you will have to write some good Java code to run the map jobs in parallel on all those machines and get the results back (reduce) to generate the final report.

Hive & Pig Frameworks: Writing MapReduce jobs isn't a piece of cake. So Facebook built the Hive framework to provide an easier way to do the same. Hive uses SQL-ish syntax to work on the data lying on the Hadoop layer below. Pig is a similar framework built by Yahoo, but it is more of a data-flow language. In Hive a user can specify that data from two tables must be joined with an easy, SQL-like syntax, but not which join implementation to use. Pig is procedural, and though you will have a little more to write, it allows you the flexibility to specify an implementation of how you would like to join different chunks of data.
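As an illustration of that difference, a join in Hive reads much like plain SQL; the table and column names in this sketch are made up:

SELECT u.name, o.amount
FROM users u JOIN orders o ON (u.id = o.user_id);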

HBase is a NoSQL database that allows you to work with the enormous amount of data stored on the Hadoop system. It is a column-oriented database management system, well suited for the sparse data sets that are common in many big data use cases. HBase does not support a structured query language like SQL; HBase applications are written in Java, much like a typical MapReduce application.

ZooKeeper is a centralized infrastructure that helps synchronize across clusters. It maintains common objects needed in large cluster environments. Examples of these objects include configuration information hierarchical naming space, and so on. Applications can leverage these services to coordinate distributed processing across large clusters.

Sqoop can be used to import data from a RDBMS system (say MySQL/Oracle/MsSQL server) into HDFS or vice-versa.

Flume is the tool to gather data from various sources into your Hadoop system. Its mostly used for log data.

Closing Remarks:
To end with, I would like to state that I am by no means an expert on big data. In fact, I am a beginner who is just interested in knowing this uber-cool technology. With this post, all I aim to do is start a discussion so that we can start learning it bit by bit together! So, if you find a mistake here, please do help me learn it better :)

References:
1) http://www.youtube.com/watch?v=R-qjyEn3bjs
2) http://www-01.ibm.com/software/data/infosphere/hadoop/

Wednesday, November 20, 2013

Backtracking-I (Primer)

Recursion has always fascinated me for its simplicity. Whenever I come across a new recursive formulation I had never known, I feel happy about knowing it. But there is a trade-off with this simplicity, for they say that, in general, recursive solutions are not as efficient as their iterative versions. Being a computer science geek attached to algorithm programming, I am obsessed with the efficiency of the code I write, but recursion, for its simplicity, always gets that exception from me. A road less traveled for me has been backtracking, because backtracking involves trying all possibilities, and it sounds awful in terms of complexity, to say the least, when you first hear about it. But sometimes it is the only solution available, or at least I see some classic problems, like the 8-queens problem, that people solve using backtracking. It is for this reason that I thought it would be good to write a post on backtracking and give some much-deserved respect to this genre of algorithms. Let's try to learn the backtracking approach through some simple problems. The first code we will write here generates all strings of n bits. For simplicity, we assume that we have a global character array 'ar' available. Here is the code:
void generate_all(int n)
{
        if(n<1) printf("%s\n", ar);
        else{
                ar[n-1]='0';        //fix (n)th bit as '0'
                generate_all(n-1);  //generate all combinations for other n-1 positions.
                ar[n-1]='1';        //fix (n)th bit as '1'
                generate_all(n-1);  //generate all combinations for other n-1 positions.
        }
}
The output of this code is: 000 100 010 110 001 101 011 111
The code is exponential in complexity: O(2^n). Here n=3, so we have 8 strings in the output. Let's now generalize this code to print k-ary strings.

[sourcecode language="cpp"]
void generate_all(int n, int k)
{
        if(n<1) printf("%s\n", ar);
        else{
                for(int j=0;j<k;j++)         //iterate over all k elements
                {
                        ar[n-1]='0'+j;       //fix the (n)th position
                        generate_all(n-1,k); //generate all combinations for the other n-1 positions
                }
        }
}
[/sourcecode]

The output of this code is as follows:
000 100 200 300 010 110 210 310 020 120 220 320 030 130 230 330 001 101 201 301 011 111 211 311 021 121 221 321 031 131 231 331 002 102 202 302 012 112 212 312 022 122 222 322 032 132 232 332 003 103 203 303 013 113 213 313 023 123 223 323 033 133 233 333
This code is also exponential in nature: O(k^n). Here k=4 and n=3, so we have 64 strings in the output. Let's now add just a little more logic to that. We now write code to print all permutations of a given string. The input string can be anything, as opposed to the first k characters starting from '0' in the earlier example.

[sourcecode language="cpp"]
#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;

void generate_all(int depth, char* permutation, int *used, char *original)
{
        int length=strlen(original);
        if(depth==length){               //base case
                permutation[depth]='\0'; //so that we have an end marker for the string
                printf("%s\n", permutation);
        }
        else{
                for(int i=0;i<length;i++){
                        if(!used[i]){
                                used[i]=1;                      //mark character i of the original string as used
                                permutation[depth]=original[i]; //fix the (depth)th position of the permutation
                                generate_all(depth+1, permutation, used, original); //generate all permutations for the remaining positions
                                used[i]=0;                      //prepare to backtrack
                        }
                }
        }
}

int main()
{
        int used[]={0,0,0};
        char original[]="abc";
        char p[sizeof(original)];        //holds the permutation being built (same size as the input, including '\0')
        generate_all(0, p, used, original);
        return 0;
}
[/sourcecode]

The output of the above code is:

abc acb bac bca cab cba

If we instead make our input string: "aaa", we get the output:

aaa aaa aaa aaa aaa aaa

I found the same question (the last one) on geeksforgeeks, but they solve it differently, in a way that is not very intuitive to me. To print only the unique strings in the output, you can follow this code. This post is a primer, and hence let us stop here. Try not to learn the exact questions and their solutions but the way the recursion unfolds and the way it backtracks again to look at the other parts of the solution space. Since we stop here with pretty basic stuff, let me promise to cover more backtracking code in the coming posts. In the meantime, if you are interested, have a look at the classic 8-queens problem and how backtracking is used to solve it.

Wednesday, November 13, 2013

Open Source India Conference: Multi-source Replication

This session was presented in Open source India 2013. The presentation covers MySQL multi-source replication which is released under MySQL Labs.

Friday, November 8, 2013

MySQL User Camp: MySQL Replication and Multi-threaded Slaves


The agenda of this talk was to introduce MySQL Replication and then follow it up with multi-threaded slave (MTS) support. The presentation introduces multi-threading of slaves by database, which is part of MySQL-5.6, as well as the multi-threading policy introduced in MySQL-5.7.2. Finally, there is a brief coverage of the new replication monitoring tables used to monitor MySQL Replication. These tables are part of MySQL Performance Schema.

Tuesday, October 15, 2013

Primality testing- II (Sieve of Atkin)

In the previous post on primality testing, we discussed successive division (the brute-force method) and the sieve of Eratosthenes, for checking whether a number is prime and for generating all primes below a given number respectively. Let's move to the Sieve of Atkin today, a modified take on the sieve of Eratosthenes. The Sieve of Atkin is not something I would suggest for a programming contest, for it is difficult to understand and remember. To be frank, there are parts of it that I don't fully understand why they are right, but it works pretty fast when coupled with a good implementation. Note that if implemented badly, it can perform worse! The purpose behind putting this article on the blog is to spread the word about the existence of such an algorithm, and the code can be used when needed ;) Let's check out the algorithm now. Some excerpts first:

  1. The algorithm treats 2, 3 and 5 as special cases and just adds them to the set of primes to start with.

  2. Like the Sieve of Eratosthenes, we start with a list of numbers we want to investigate. Suppose we want to find primes <= 100; then we make a list for [5, 100]. As explained in (1), 2, 3 and 5 are special cases, and 4 is not a prime.

  3. The algorithm talks in terms of modulo-sixty remainders.

  4. All numbers with modulo-sixty remainder 1, 13, 17, 29, 37, 41, 49, or 53 have a modulo-four remainder of 1. These numbers are prime if and only if the number of solutions to 4x² + y² = n is odd and the number is squarefree (a worked check follows this list). A squarefree integer is one which is not divisible by any perfect square other than 1.

  5. All numbers with modulo-sixty remainder 7, 19, 31, or 43 have a modulo-six remainder of 1. These numbers are prime if and only if the number of solutions to 3x² + y² = n is odd and the number is squarefree.

  6. All numbers with modulo-sixty remainder 11, 23, 47, or 59 have a modulo-twelve remainder of 11. These numbers are prime if and only if the number of solutions to 3x² − y² = n is odd and the number is squarefree.
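As a quick sanity check of point (4), take n = 13, counting only positive x and y: 13 mod 60 = 13, and 4x² + y² = 13 has exactly one solution, (x, y) = (1, 3). One is odd and 13 is squarefree, so the test classifies 13 as prime, as expected.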


Points (4-6) are not intuitive, and hence this algorithm is a little difficult to understand. But if you are interested, check out the proofs in Atkin's paper. With these points in mind, let's look at the following pseudocode from the Wikipedia article:
// arbitrary search limit
limit ← 1000000

// initialize the sieve
for i in [5, limit]: is_prime(i) ← false

// put in candidate primes:
// integers which have an odd number of
// representations by certain quadratic forms
for (x, y) in [1, √limit] × [1, √limit]:
    n ← 4x²+y²
    if (n ≤ limit) and (n mod 12 = 1 or n mod 12 = 5):
        is_prime(n) ← ¬is_prime(n)
    n ← 3x²+y²
    if (n ≤ limit) and (n mod 12 = 7):
        is_prime(n) ← ¬is_prime(n)
    n ← 3x²-y²
    if (x > y) and (n ≤ limit) and (n mod 12 = 11):
        is_prime(n) ← ¬is_prime(n)

// eliminate composites by sieving
for n in [5, √limit]:
    if is_prime(n):
        // n is prime, omit multiples of its square; this is
        // sufficient because composites which managed to get
        // on the list cannot be square-free
        is_prime(k) ← false, k ∈ {n², 2n², 3n², ..., limit}

print 2, 3
for n in [5, limit]:
    if is_prime(n): print n

And finally, let's convert this pseudocode into Python:

[sourcecode language="Python"]
#!/usr/bin/python2.7 -tt

import sys
from math import sqrt, ceil, pow

def primes_below(limit):  # sieve of Atkin
    sqroot = int(ceil(sqrt(limit)))
    is_prime = [False] * (limit + 1)   # index up to and including 'limit'
    primes = [2, 3]

    for x in xrange(sqroot + 1):
        for y in xrange(sqroot + 1):
            # n = 4*x^2 + y^2
            n = 4 * int(pow(x, 2)) + int(pow(y, 2))
            if n <= limit and (n % 12 == 1 or n % 12 == 5):
                is_prime[n] = not is_prime[n]
            # n = 3*x^2 + y^2
            n = 3 * int(pow(x, 2)) + int(pow(y, 2))
            if n <= limit and n % 12 == 7:
                is_prime[n] = not is_prime[n]
            # n = 3*x^2 - y^2
            n = 3 * int(pow(x, 2)) - int(pow(y, 2))
            if n <= limit and x > y and n % 12 == 11:
                is_prime[n] = not is_prime[n]

    for x in xrange(5, limit + 1, 2):
        if is_prime[x]:
            primes.append(x)
            # x is prime; strike out its multiples starting at x*x
            # (this also removes the non-squarefree survivors)
            for y in xrange(x * x, limit + 1, x):
                is_prime[y] = False
    print primes

def main():
    primes_below(100)

if __name__ == '__main__':
    main()
[/sourcecode]

The C++ code for the algorithm can be found here. Enjoy!

Primality testing- I (brute force, Sieve of Eratosthenes)

If you are into algorithms, you will realize that primality testing is a very important topic. In simple terms, primality testing means finding out whether a number is prime or not. Let's discuss a few tricks for the same. There is actually a lot of number theory based on primality testing, and it is impossible to cover all of it here. Nevertheless, let's discuss some of it. This post starts with a simple brute-force algorithm and introduces the Sieve of Eratosthenes. With these simple things covered, we will try to discuss more of primality testing in the following posts. Let's get started with the brute-force approach!

NAIVE ALGORITHM:

Let's say the given number is n. The simplest (brute-force) way to find out whether a number is prime or not is to check whether it is divisible by any number in the range 2 to n-1. This follows from the definition of a prime number and should be easy to understand. At this moment, let's revisit a basic school-grade mathematics property which says:
If a number, say n, is divisible by an integer k, then either k or n/k has to be <= sqrt(n)

This means you don't need to check whether n is divisible by any of the numbers in [2, n-1]; rather, you can just check [2, sqrt(n)]. Give yourself a pat on the back, because that is a good enough optimization. If you were making 10000 checks in the former case, you have reduced that to just 100 iterations with the addition of this property. Let's have a look at a small Python code to illustrate this (source: github):

[sourcecode language="python"]
from math import floor, sqrt

def isPrime(n):
    maxPossibleFactor = int( floor( sqrt(n) ) )
    possibleFactors = range( 2, maxPossibleFactor+1 )
    divisibleFactors = [ n % i == 0 for i in possibleFactors ]
    return not any( divisibleFactors )
[/sourcecode]

That was easy. Right?

SIEVE OF ERATOSTHENES:

Let's move ahead and look at a slightly different question: identify all the prime numbers below 1000. A naive way to do this is to call the above Python function for every number in [2, 1000]. Sounds bad... Can we do better? Yes, we can. It turns out there is a well-known algorithm for doing this: the Sieve of Eratosthenes. Let's see how this algorithm works (the explanation below is copied from here):
Make a list of all the integers less than or equal to n (and greater than one). Strike out the multiples of all primes less than or equal to the square root of n, then the numbers that are left are the primes.

For example, to find all the primes less than or equal to 30, first list the numbers from 2 to 30.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

The first number 2 is prime, so keep it (we will color it green) and cross out its multiples (we will color them red), so the red numbers are not prime.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

The first number left (still black) is 3, so it is the first odd prime. Keep it and cross out all of its multiples. We know that all multiples less than 9 (i.e. 6) will already have been crossed out, so we can start crossing out at 3²=9.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Now the first number left (still black) is 5, the second odd prime. So keep it also and cross out all of its multiples (all multiples less than 5²=25 have already been crossed out, and in fact 25 is the only multiple not yet crossed out).
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

The next number left, 7, is larger than the square root of 30, so there are no multiples of 7 to cross off that haven't already been crossed off (14 and 28 by 2, and 21 by 3), and therefore the sieve is complete. Therefore all of the numbers left are primes: {2, 3, 5, 7, 11, 13, 17, 19, 23, 29}. Notice we just found these primes without dividing.

Now, lets look at a fun animation to illustrate the algorithm (source: wikipedia):

[Animation: Sieve of Eratosthenes]
With the algorithm clear now, let's look at a Python code (source: stackoverflow).

[sourcecode language="python"]
def primes_sieve2(limit):
    a = [True] * limit                      # Initialize the primality list
    a[0] = a[1] = False

    for (i, isprime) in enumerate(a):
        if isprime:
            yield i                         # i is prime
            for n in xrange(i*i, limit, i): # Mark factors non-prime
                a[n] = False
[/sourcecode]

If you have trouble understanding some part of this post or have any other related question, please drop a comment. And by the time we have the next post ready, you may try some questions related to prime numbers at http://projecteuler.net/problems. Happy coding :)

Sunday, September 22, 2013

MySQL 5.7: Monitoring Replication with the Performance Schema tables

In the previous post, we had a look at the Performance Schema tables to monitor MySQL replication. We discussed why these were needed, what advantages they have to offer and what each table stands for. We also did a small exercise to see how one can select a particular table given the data that is needed. We will now dive deeper and look at the individual tables to understand what fields they are made of. We will also go through different scenarios to see how the different pieces of data can be used.

Note that these tables report the status/configuration of a replication slave, so they are empty on a master server or on a standalone server. All the SELECTs we will talk about are executed on the slave. Let's take the tables one by one and go through them.

REPLICATION_CONNECTION_CONFIGURATION:

This table shows the configuration parameters used by the slave server for connecting to the master server. Lets set up replication with the following CHANGE MASTER TO command executed on a slave server. The  CHANGE MASTER TO command adds/changes the parameters that the slave server uses for connecting to the master server. To see the definition of all the fields in this table, please visit our official documentation.

mysql> change master to master_host='127.0.0.1', master_port=13000 ,
master_user='rpluser', master_password='secret', master_connect_retry=40, 
master_retry_count=10, master_ssl=1, master_ssl_verify_server_cert=1, 
master_auto_position=1; 
Query OK, 0 rows affected, 2 warnings (0.51 sec)

We can now look up the connection parameters set in our table for replication connection configuration. Lets do a SELECT * on this table to see what we get:

mysql> select * from performance_schema.replication_connection_configuration\G
*************************** 1. row ***************************
                         HOST: 127.0.0.1
                         PORT: 13000
                         USER: rpluser
            NETWORK_INTERFACE:
                AUTO_POSITION: 1
                  SSL_ALLOWED: YES
                  SSL_CA_FILE:
                  SSL_CA_PATH:
              SSL_CERTIFICATE:
                   SSL_CIPHER:
                      SSL_KEY:
SSL_VERIFY_SERVER_CERTIFICATE: YES
                 SSL_CRL_FILE:
                 SSL_CRL_PATH:
    CONNECTION_RETRY_INTERVAL: 40
       CONNECTION_RETRY_COUNT: 10
1 row in set (0.00 sec)

You can see that the changes made by the CHANGE MASTER TO command are reflected in this table. These values are preserved here until the next CHANGE MASTER TO or RESET SLAVE ALL command. A row in this table represents a connection between the slave and its master. Since we have only one master here, we see only one row. If MySQL supported multi-source replication, this table would have information about multiple sources: one row per source.

Note that a START SLAVE has not been done yet. The CHANGE MASTER TO command changes the internal data structures holding the connection parameters (if you know the internals, the "active_mi" variable is what I am talking about) and this table just picks up values from that data structure. So, the values shown in the table simply reflect the state of this data structure at the moment the SELECT was done.

Lets now move to the next table "replication_connection_status".

REPLICATION_CONNECTION_STATUS:

This table shows the current status of the connection between the slave and its master. Technically, it talks about the IO thread's status. To view the definition of all the fields in this table, please visit our official documentation. Now we need a START SLAVE, because this command establishes the connection between the slave and its master. So, let's execute a START SLAVE / START SLAVE IO_THREAD command followed by a SELECT * from the replication_connection_status table.

mysql> start slave; Query OK, 0 rows affected (0.05 sec)

mysql> select * from performance_schema.replication_connection_status\G
*************************** 1. row ***************************
             SOURCE_UUID: 1766a057-10cd-11e3-b06a-0021cc72bfc3
               THREAD_ID: 2
           SERVICE_STATE: ON
RECEIVED_TRANSACTION_SET:
       LAST_ERROR_NUMBER: 0
      LAST_ERROR_MESSAGE:
    LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00

So, we can see from here that, at the moment SELECT was done:

  1. The slave was reading from a master with UUID= 1766a057-10cd-11e3-b06a-0021cc72bfc3,
  2. The thread that handles this connection (the IO thread) has id=2. The thread_id field is very useful if you want to join this table with other Performance Schema/Information Schema tables. As an example, you can join this table with performance_schema.threads to extract a lot more information about this connection (a join sketch follows below). Likewise, there are a lot more thread-wise statistics you can collect from the other existing Performance Schema tables. Note that there is a bug here, but thankfully there is an easy workaround.
  3. Further, we see that our connection (the IO thread) is ready for service (SERVICE_STATE: ON). 
Note that the service state shows one of (ON/OFF/CONNECTING):
  • ON: meaning ready to serve, i.e. if an event is executed on the master, you can assume that it will be read by the slave via this connection.
  • OFF: meaning the connection (IO thread) is not serving, i.e. it is not ready to read from the master or has been killed.
  • CONNECTING: meaning the connection (IO thread) is not established yet, but the slave is trying to connect.
It is important to note here that the names and semantics of thread_id and service_state extend to all the status tables and mean exactly the same thing everywhere. Hence, going forward, let's skip explaining these two columns in the following tables.
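As point (2) above suggests, the thread_id can be used to pull in more details about the receiver thread from other Performance Schema tables; here is a minimal join sketch, assuming the THREAD_ID values line up (keep the bug mentioned above in mind):

SELECT t.NAME, t.PROCESSLIST_STATE, rcs.SERVICE_STATE
  FROM performance_schema.replication_connection_status rcs
  JOIN performance_schema.threads t
    ON t.THREAD_ID = rcs.THREAD_ID;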

Lets now execute a couple of queries on the master server and see if we can get the set of received transactions on our slave server.

mysql>create table test.t(a int primary key); Query OK, 0 rows affected (0.61 sec)
mysql>insert into test.t values(1); Query OK, 1 row affected (0.06 sec)

Lets turn back to our slave server and see what the field "received_transaction_set" has.

mysql> select received_transaction_set from performance_schema.replication_connection_status\G
*************************** 1. row ***************************
RECEIVED_TRANSACTION_SET: 1766a057-10cd-11e3-b06a-0021cc72bfc3:1-4
1 row in set (0.00 sec)

Lets now cause an error in the connection (IO thread) and look at the error related fields in this table. To cause an error in the connection, we will provide a wrong password in our CHANGE MASTER TO command.


mysql> stop slave;
Query OK, 0 rows affected (0.10 sec)

mysql> change master to master_password='abcd';
Query OK, 0 rows affected, 2 warnings (0.29 sec)

mysql> start slave;
Query OK, 0 rows affected (0.05 sec)


mysql> select last_error_number, last_error_message, last_error_timestamp from 
performance_schema.replication_connection_status\G
*************************** 1. row ***************************                   
  LAST_ERROR_NUMBER: 1045                                                         
  LAST_ERROR_MESSAGE: error connecting to master 'rpluser@127.0.0.1:13000' - retry-time: 40 retries: 2                                                           
LAST_ERROR_TIMESTAMP: 0000-00-00  
1 row in set (0.00 sec)


The names and semantics of the error-related fields (number, message and timestamp) are again exactly the same across all the tables, wherever applicable. Hence, going ahead, let's skip monitoring the error-related fields in the following tables.

Lets now clean-up the error we caused to move ahead and work with other tables.

mysql> stop slave;
Query OK, 0 rows affected (0.06 sec)

mysql> change master to master_password='secret';
Query OK, 0 rows affected, 2 warnings (0.25 sec)

mysql> start slave;
Query OK, 0 rows affected (0.08 sec)

mysql> select last_error_number, last_error_message, last_error_timestamp from 
performance_schema.replication_connection_status\G
*************************** 1. row ***************************                   
  LAST_ERROR_NUMBER: 0                                                         
  LAST_ERROR_MESSAGE:                                                           
LAST_ERROR_TIMESTAMP: 0000-00-00  
1 row in set (0.00 sec) 

Great! We are done with monitoring the connection specific parameters between a slave and its master. Lets now move to the other half of replication module- execution of events received from the master on the slave server.

REPLICATION_EXECUTE_CONFIGURATION/STATUS:

This table shows the configuration parameters that affect the execution of transactions by the slave server. Parameters stored in the table can be changed with the CHANGE MASTER TO command. Let's set up delayed replication now and see how the delay can be read from the replication_execute_configuration table.


mysql> stop slave;
Query OK, 0 rows affected (0.11 sec)

mysql> change master to master_delay=150;
Query OK, 0 rows affected (0.29 sec)

mysql> start slave;
Query OK, 0 rows affected (0.06 sec)

mysql> select desired_delay from performance_schema.replication_execute_configuration\G
*************************** 1. row ***************************
DESIRED_DELAY: 150
1 row in set (0.00 sec)

Lets now execute an event at our master to see if the event is executed at the slave or not.

mysql> insert into test.t values(2); Query OK, 1 row affected (0.06 sec)

Let's move to the slave and make sure the 150-second delay has not elapsed yet.

mysql> select remaining_delay from performance_schema.replication_execute_status\G
*************************** 1. row ***************************
REMAINING_DELAY: 147
1 row in set (0.00 sec)

Now we should be able to see that although the insert(2) is received at slave it is not executed yet. The last time we saw our received_transaction_set (before insert(2)), it was 1766a057-10cd-11e3-b06a-0021cc72bfc3:1-4. So insert(2) should be assigned a GTID 1766a057-10cd-11e3-b06a-0021cc72bfc3:5.

mysql> select received_transaction_set from performance_schema.replication_connection_status\G
*************************** 1. row ***************************
RECEIVED_TRANSACTION_SET: 1766a057-10cd-11e3-b06a-0021cc72bfc3:5
1 row in set (0.00 sec)

Ok, we have received the event. Now lets see if this insert(2) with GTID 1766a057-10cd-11e3-b06a-0021cc72bfc3:5 is executed yet.

mysql> select @@global.gtid_executed\G
*************************** 1. row ***************************
@@global.gtid_executed: 1766a057-10cd-11e3-b06a-0021cc72bfc3:1-4
1 row in set (0.00 sec)

Let's now wait until our delay has elapsed.

mysql> select remaining_delay from performance_schema.replication_execute_status\G
*************************** 1. row ***************************
REMAINING_DELAY: 0
1 row in set (0.00 sec)

And check if the insert(2) has been executed.

mysql> select @@global.gtid_executed\G
*************************** 1. row ***************************
@@global.gtid_executed: 1766a057-10cd-11e3-b06a-0021cc72bfc3:1-5
1 row in set (0.00 sec)

Executed. Great!

Before going ahead to have a look at the other goodies in store, lets revise a couple of things:
  • SQL THREAD: Executes the events received from the master at slave.
  • MULTI-THREADED SLAVE (MTS): With only one thread (the SQL thread) reading and applying events that were applied by multiple clients concurrently on the master, the slave starts lagging behind the master. In this case, one can choose to opt for MTS. Simply put, we have a buffer (the relay log) on the slave server that stores the events executed on the master server. The coordinator thread reads the relay log and assigns the events to worker threads, which execute them in parallel on the slave. The coordinator thread is the scheduler and the worker threads are responsible for executing the events.

REPLICATION_EXECUTE_STATUS_BY_COORDINATOR:

This table is used in two ways, depending on whether the slave is operating with one applier (MTS disabled) or with multiple appliers executing events in parallel (MTS enabled). When operating with one applier, this table reports the status of that applier. With multiple appliers, the same table reports the status of the scheduler (the coordinator thread). Let's explore the single-applier use case.

mysql> select * from performance_schema.replication_execute_status_by_coordinator\G
*************************** 1. row ***************************
           THREAD_ID: 13
       SERVICE_STATE: ON
   LAST_ERROR_NUMBER: 0
  LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
1 row in set (0.00 sec)

We see that the applier's (SQL thread's) id is 13 and it is alive (SERVICE_STATE: ON). For now there are only two states shown: ON/OFF. The semantics of the error fields are exactly the same as explained earlier, hence we skip them here.

The Multiple applier case (MTS enabled) is similar except that this table will show the scheduler's (coordinator) thread id. Hence we will skip the explanations and move to the next table. To read more about this table, please visit our official documentation for this table.

REPLICATION_EXECUTE_STATUS_BY_WORKER (Single Applier Case):

When the slave server is operating in single-applier mode, let's see what the table replication_execute_status_by_worker shows us. Conceptually, we don't have a scheduler and multiple appliers, so the table should be empty.

mysql> select * from performance_schema.replication_execute_status_by_worker\G
Empty set (0.00 sec)

And it is empty :)

REPLICATION_EXECUTE_STATUS_BY_WORKER (Multiple Appliers Case):

Let's now set up our slave to use multiple appliers (MTS enabled) and then see how we can monitor the replication status of our scheduler and appliers (coordinator and worker threads). Let's call our multiple appliers "workers".

mysql> stop slave; Query OK, 0 rows affected (0.10 sec)
mysql> SET GLOBAL slave_parallel_workers=2; Query OK, 0 rows affected (0.00 sec)

Note that a START SLAVE is necessary to have the workers ready for action. Just setting slave_parallel_workers=2, doesn’t spawn the 2 workers.

mysql> start slave; Query OK, 0 rows affected, 1 warning (0.14 sec)

mysql> select * from performance_schema.replication_execute_status_by_worker\G
*************************** 1. row ***************************
            WORKER_ID: 0
            THREAD_ID: 16
        SERVICE_STATE: ON
LAST_SEEN_TRANSACTION:
    LAST_ERROR_NUMBER: 0
   LAST_ERROR_MESSAGE:
 LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
            WORKER_ID: 1
            THREAD_ID: 17
        SERVICE_STATE: ON
LAST_SEEN_TRANSACTION:
    LAST_ERROR_NUMBER: 0
   LAST_ERROR_MESSAGE:
 LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
2 rows in set (0.00 sec)

So, we see that we have two rows representing the two workers. The worker_id field here is the same as the id field in the mysql.slave_worker_info table, so you can again join the two tables to know more about a worker thread (see the sketch below). The service_state (ON/OFF) has nothing new to add.
The interesting parts here are:
  1. You get the per-worker error. So, if two of your workers have errored out in a group, you can identify both errors, as opposed to only one through SHOW SLAVE STATUS.
  2. last_seen_transaction is the new field we have added to help detect MTS failures better. It helps you find the GTID of the transaction each worker was executing at the moment an error happened. These values are preserved in case of an error, and also while a worker is idle after executing its transactions.
To read more about this table and field-wise descriptions, please visit our official documentation for this table.
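Building on the note above that worker_id matches the id column of mysql.slave_worker_info, the two tables can be joined; a minimal sketch, assuming that table exposes the Id, Relay_log_name and Relay_log_pos columns:

SELECT w.WORKER_ID, w.SERVICE_STATE, w.LAST_SEEN_TRANSACTION,
       i.Relay_log_name, i.Relay_log_pos
  FROM performance_schema.replication_execute_status_by_worker w
  JOIN mysql.slave_worker_info i
    ON i.Id = w.WORKER_ID;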

CONCLUDING REMARKS:

The goal of this post was to help you become familiar with the Performance Schema tables relating to replication. Please try out this new interface for monitoring MySQL replication and let us know your feedback. To read more on these Performance Schema tables, you can visit the blog on multi-source replication by Rith. These Performance Schema tables look really powerful with multi-source replication. The motive behind this feature was to make it easier to monitor the health and performance of MySQL replication. We hope this new interface makes our users' lives easier.

Finally a "software developer"!

It was the summer of 2007. I was a class X student whose only interaction with a computer had been playing Mario once, some 5 years earlier. I was asked to choose between Biology and Computer Science; Mathematics or Hindi was the other choice I had to make. Though I used to write poems and stories and give lectures in Hindi (believe me, I was quite famous in school for these), Mathematics was my darling (alas, this changed at college!) and it was easy to choose Mathematics. The choice between Biology and Computer Science was a difficult one, and it ended up in favor of Computer Science just because I was terrible at drawing a heart or a kidney. Some of my friends will remember that I took 3 hours to draw a heart in my biology class ;)

So I ended up being a computer science geek, and thankfully so, for I am thoroughly enjoying the experience every single day. For the next two years the only interaction I had with a computer was the blue TurboC screen; yes, the horrible old compiler, but I have very fond memories of it, so I still don't call it stupid :) My grasp of OOP concepts two years later was better than it is today, thanks to Nancy Madam. Ma'am, if you are reading this, you deserve a big thanks. And the choice of branch when I got into engineering was a no-brainer.

Thanks to all those whose names appear in the acknowledgement section of this blog, and more, I am a proud open source contributor working on the most popular database on the web: MySQL. I am a part of the MySQL replication development team, and working with these guys here is a looooooot of fun. Having been a part of MySQL Replication for a year, I feel the best part has been the amount of learning that has come from talking to the MySQLers, not to forget the NCGs 2012 (well, we are a bunch of new college grads who joined MySQL in July 2012). NCGs, you rock \o/.

Let's now return to the title and the motivation for this post. It has been a dream since 2007, when I had no interaction with computers: to develop a lot of features and to contribute to this golden age of technology as much as I can. And the first tiny step into this world (tiny in terms of the years of software development I aim at), my first design and implementation of a MySQL replication feature, is now released. While people check out the blogs on this feature and browse the source I have written, I will sit and enjoy motivating myself to do a lot more in this huge IT world. I am a software developer by choice; nothing very big as of now, but I am confident I will make it really big one day. For now, for the first time in my life, I feel I have become a software developer, and it is time to start the journey.

And finally some advertisement :D. Please check out:

  1. My blog about this feature

  2. The official documentation

  3. Download mysql-5.7.2, try it out & let us know your feedback and

  4. The source code too, if you feel interested enough :)


[UPDATE]:

My second blog on this feature- http://shivjijha.blogspot.com/2013/09/Monitoring-Replication-with-the-NEW-performance-schema-tables.html

Enjoy!

Saturday, September 21, 2013

MySQL 5.7: Introducing the Performance Schema tables to monitor Replication


MySQL-5.6 was our best release ever and we are happy to see people praising it. This motivates us to work even harder for our next release. The MySQL-5.7.2 DMR is out and it already includes a number of MySQL replication features. Below is a list of these features:

    - mysqlbinlog idempotent mode
    - mysqlbinlog --rewrite-db option
    - Dump thread does not take binary log lock
    - Lossless semisync
    - intra-schema multi-threaded slave
    - Performance Schema tables for replication monitoring (this post :) )

We want to make MySQL-5.7 huge and these are tiny steps towards the same.

This post introduces the Performance Schema tables to monitor MySQL replication. You can find the official documentation for the same here. This post gives a brief overview of these Performance Schema tables in an attempt to prepare you to dive deeper into the details. Please follow my second blog post on this topic to know the details on the individual tables.

Why Performance Schema Tables:

Replication is a delicate module of MySQL and people want to know its status frequently. To monitor MySQL replication health, we have long used the SHOW SLAVE STATUS command. But replication is growing fast and this static command cannot keep up with people's expectations. SHOW SLAVE STATUS does not scale, in the sense that a replication slave has multiple parts (a receiver part, an applier part, intermediate queues, etc.) and, in some cases, multiple instances of each part. The command has 54 fields as of now, interleaving different pieces of information. It has reached the point where we foresee that not having an SQL interface to query exactly what is required from the replication status would make monitoring tedious. So we decided to provide tables to monitor replication, and we have put all these tables in the performance_schema database.

The motivation behind having tables is:

  • To make it easier to access exactly what is required, through an SQL interface,
  • Pieces of data can be assigned to variables and thus used in stored procedures etc. (see the sketch after this list),
  • Easier testing with SELECT item from Performance Schema tables,
  • Split the logically unrelated information into different tables,
  • Cross-reference monitoring data seamlessly by joining with other Performance Schema tables, Information_Schema tables etc,
  • Easier to extend and
  • More flexibility to accommodate a lot of replication information but still be organized and easy to use.
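As an example of the second point above, here is a small sketch of pulling one piece of replication status into a user variable (a single source is assumed, so the table has one row):

SELECT SERVICE_STATE INTO @receiver_state
  FROM performance_schema.replication_connection_status;
SELECT @receiver_state;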

Additionally, we noted that SHOW SLAVE STATUS uses a lot of technical jargon (words like IO, SQL, Relay_log etc.). In order to make replication easier to monitor, we decided to hide these implementation-specific details so that the names are easier to understand. We have tried our best to make the names of the tables and the fields convenient for everyone to understand.

Why 6 tables? What does each stand for:

The idea is to come up with a better organized and an easy to extend interface. To start with, we have split the information under SHOW SLAVE STATUS into different parts based on:

  1. Connection information or execute information
  2. In each of (1), we further have configuration and status related fields put into different tables.

Based on the above classifications, we have got four tables:

    a)    replication_connection_configuration,
    b)    replication_connection_status,
    c)    replication_execute_configuration and
    d)    replication_execute_status

Note that putting all of the replication applier status information under (d) would become confusing again; it makes more sense to split it based on overall status, the coordinator's status and the workers' status. So, we have two more tables, namely:

    e)    replication_execute_status_by_coordinator and
    f)     replication_execute_status_by_worker.

Note that (d) is responsible for showing the overall status of the applier module in replication, (e) relates to the coordinator thread and (f) refers to the worker threads.
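Once you are on a MySQL-5.7.2 (or later) server, a quick way to list these tables is simply:

SHOW TABLES FROM performance_schema LIKE 'replication%';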

See the tree diagram below for the basis of division of data into different tables as discussed above:


To make it easier to select a table when you have a configuration/status parameter in mind, just ask yourself the following questions:

  • Is it relating to the  connection between the slave and its master?
  • If yes, you have narrowed down your scope to only (a) and (b).
  • If not, the others (c, d, e or f).
  • Suppose the data you are looking for relates to the connection. Now just ask yourself whether it is a configuration parameter or relates to the status of the connection between the slave and its master, and there you are: you know which table to look at.
  • If the data you are looking for relates to the execution of events received at your slave, (c, d, e, f) are the tables you have. Now ask yourself the same question: does this data relate to a configuration parameter (c) or to the status of execution of events at the slave (d, e or f)?

For those who know the internals and have been using SHOW SLAVE STATUS, the names might look new and it could take a little time to get used to them. So, let's now see how these tables relate to the different replication threads. If you are not familiar with the threads and buffers, you don't really need to understand them; this new replication monitoring interface is designed to make sure you don't have to, so you can skip this section. If you are familiar with the replication threads and want to understand the tables in terms of threads, you can use the table below to map these tables to the replication threads you want to monitor. But before we move ahead, let's revise a couple of things:

  • IO THREAD: Responsible for the connection between master and slave, also gets queries   executed on the master onto the slave.
  • SQL THREAD: Executes the events received from the master at slave.
  • MULTI-THREADED SLAVE (MTS): With only one thread (the SQL thread) reading and applying events that were applied by multiple clients concurrently on the master, the slave starts lagging behind the master. In this case, one can choose to opt for MTS (a small sketch follows this list). Simply put, we have a buffer (the relay log) on the slave server that stores the events executed on the master server. The coordinator thread reads the relay log and assigns the events to worker threads, which execute them in parallel on the slave. The coordinator thread is the scheduler and the worker threads are responsible for executing the events.
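As a side note, a minimal sketch of switching a slave over to multiple appliers; the worker count is arbitrary, and the slave threads must be restarted for the workers to be spawned:

STOP SLAVE;
SET GLOBAL slave_parallel_workers= 4;
START SLAVE;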

With the background set, lets now look at the table below to understand the mapping between the Performance Schema tables and the replication threads they monitor.

[Table: mapping of the Performance Schema tables to the replication threads they monitor]

To end with, it may interest you to note that these Performance Schema tables not only give you an easier way to monitor replication but also allow you to easily obtain all the information available about replication threads and coordinates through a join with the other Performance Schema tables, Information Schema tables, mysql.slave_worker_info etc.

Go check it out and let us know if it allows you to know more about the replication status. We want to have your valuable feedback to improve this and add a lot more information to this interface.

Want to read more and explore the individual tables and fields, and how to use them? Here is the deep-dive in the next post :) Enjoy!