Posts

Showing posts from April, 2015

TFIDF code

TF-IDF can be computed with three MapReduce jobs. TF and DF can each be obtained from a simple MapReduce job; posted here is the third job, which produces the final TF-IDF deliverables.

Custom Key

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;

    public class Key implements WritableComparable<Key> {
        Text word;
        IntWritable type;

        // Hadoop needs a no-arg constructor that initializes the fields.
        public Key() {
            word = new Text();
            type = new IntWritable();
        }

        public Text getWord() { return word; }
        public void setWord(Text word) { this.word = word; }
        public IntWritable getType() { return type; }
        public void setType(IntWritable type) { this.type = type; }

        // Deserialize the fields in the same order they are written.
        @Override
        public void readFields(DataInput arg0) throws IOException {
            this.word.readFields(arg0);
            this.type.readFields(arg0);
        }

        // Serialize the word first, then the type.
        @Override
        public void write(DataOutput arg0) throws IOException {
            this.word.write(arg0);
            this.type.write(arg0);
        }

        @...
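To make the final step concrete, here is a minimal in-memory sketch of the arithmetic the third job has to perform. It is not the MapReduce code from the post: the class `TfIdfSketch`, its method names, and the weighting `tf * ln(N / df)` are assumptions chosen for illustration, since the excerpt does not show which TF-IDF variant is used.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch; not the MapReduce job from the post.
class TfIdfSketch {

    // Term frequency: raw count of each word in one document.
    static Map<String, Integer> termFrequency(List<String> doc) {
        Map<String, Integer> tf = new HashMap<>();
        for (String w : doc) {
            tf.merge(w, 1, Integer::sum);
        }
        return tf;
    }

    // Document frequency: number of documents that contain each word.
    static Map<String, Integer> documentFrequency(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            // A set, so a word repeated in one document is counted once.
            for (String w : new HashSet<>(doc)) {
                df.merge(w, 1, Integer::sum);
            }
        }
        return df;
    }

    // Assumed weighting: tf * ln(N / df).
    static double tfIdf(int tf, int df, int totalDocs) {
        return tf * Math.log((double) totalDocs / df);
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("hadoop", "map", "reduce", "map"),
            Arrays.asList("hadoop", "hdfs"));
        Map<String, Integer> df = documentFrequency(docs);
        for (List<String> doc : docs) {
            for (Map.Entry<String, Integer> e : termFrequency(doc).entrySet()) {
                double score = tfIdf(e.getValue(), df.get(e.getKey()), docs.size());
                System.out.println(e.getKey() + "\t" + score);
            }
        }
    }
}
```

Under this weighting, a word that appears in every document scores zero, which is why the third job needs both the TF and the DF outputs as inputs.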

Reduce Side Join (Secondary Sorting) program code.

Custom Key

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;

    public class EmployeeKey implements WritableComparable<EmployeeKey> {
        Text StateID;
        IntWritable type;

        public Text getStateID() { return StateID; }
        public void setStateID(Text stateID) { StateID = stateID; }
        public IntWritable getType() { return type; }
        public void setType(IntWritable type) { this.type = type; }

        // Hadoop needs a no-arg constructor that initializes the fields.
        public EmployeeKey() {
            StateID = new Text();
            type = new IntWritable();
        }

        // Deserialize the fields in the same order they are written.
        @Override
        public void readFields(DataInput in) throws IOException {
            this.StateID.readFields(in);
            this.type.readFields(in);
        }

        // Serialize the state ID first, then the type.
        @Override
        public void write(DataOutput out) throws IOException {
            this.StateID.write(out);
            this.type.write(out);
        }

        // Order by state ID first.
        @Override
        public int compareTo(EmployeeKey o) {
            int cmp = 0;
            cmp = this.StateID.compareTo(o.getStateID());
            if(cmp =...
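The excerpt stops inside `compareTo`, so the following is only a sketch of the ordering a composite key of this shape typically provides for secondary sorting, not the post's own code. The class `CompositeKeySketch` is hypothetical, it uses plain Java types instead of Hadoop writables so it runs without a cluster, and the convention that type 0 marks the state record and type 1 the employee records is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a composite join key; plain Java, no Hadoop types.
class CompositeKeySketch implements Comparable<CompositeKeySketch> {
    final String stateId;
    final int type; // assumed convention: 0 = state record, 1 = employee record

    CompositeKeySketch(String stateId, int type) {
        this.stateId = stateId;
        this.type = type;
    }

    // Primary order: state ID. Tie-break: type, so the lower type sorts first.
    @Override
    public int compareTo(CompositeKeySketch o) {
        int cmp = stateId.compareTo(o.stateId);
        if (cmp == 0) {
            cmp = Integer.compare(type, o.type);
        }
        return cmp;
    }

    @Override
    public String toString() {
        return stateId + ":" + type;
    }

    // Returns a sorted copy, mimicking the order the shuffle would deliver.
    static List<CompositeKeySketch> sorted(CompositeKeySketch... keys) {
        List<CompositeKeySketch> list = new ArrayList<>(Arrays.asList(keys));
        Collections.sort(list);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(sorted(
            new CompositeKeySketch("TX", 1),
            new CompositeKeySketch("CA", 1),
            new CompositeKeySketch("TX", 0),
            new CompositeKeySketch("CA", 0)));
    }
}
```

With this ordering, each state's type-0 record arrives before its type-1 records, so a reducer could read the state record once and then attach it to every employee record that follows.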

Failover and fencing

1. The transition from the active namenode to the standby is managed by a new entity in the system called the failover controller.
2. Failover controllers are pluggable, but the first implementation uses ZooKeeper to ensure that only one namenode is active.
3. Each namenode runs a lightweight failover controller process whose job it is to monitor its namenode for failures (using a simple heartbeating mechanism) and trigger a failover should a namenode fail.
4. Failover may also be initiated manually by an administrator, in the case of routine maintenance, for example. This is known as a graceful failover, since the failover controller arranges an orderly transition for both namenodes to switch roles.
5. In the case of an ungraceful failover, however, it is impossible to be sure that the failed namenode has stopped running. For example, a slow network or a network partition can trigger a failover transition, even though the previously active namenode is still runn...
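The heartbeat check and the two kinds of failover can be illustrated with a small simulation. This is a sketch under loud assumptions: `FailoverSketch` and everything in it are hypothetical names, a single controller object stands in for the per-namenode controllers and ZooKeeper, and "fencing" is reduced to marking the old active node as `FENCED` so it cannot be treated as active again.

```java
// Hypothetical simulation; not Hadoop's failover controller.
class FailoverSketch {

    enum State { ACTIVE, STANDBY, FENCED }

    static class Node {
        final String name;
        State state;
        long lastHeartbeatMillis;

        Node(String name, State state, long lastHeartbeatMillis) {
            this.name = name;
            this.state = state;
            this.lastHeartbeatMillis = lastHeartbeatMillis;
        }
    }

    static class Controller {
        Node active;
        Node standby;
        final long timeoutMillis;

        Controller(Node active, Node standby, long timeoutMillis) {
            this.active = active;
            this.standby = standby;
            this.timeoutMillis = timeoutMillis;
        }

        // Heartbeat check: returns true if a failover was triggered.
        boolean check(long nowMillis) {
            if (nowMillis - active.lastHeartbeatMillis <= timeoutMillis) {
                return false; // heartbeat is recent enough
            }
            ungracefulFailover();
            return true;
        }

        // The old active may still be running, so fence it before promoting.
        void ungracefulFailover() {
            active.state = State.FENCED;
            standby.state = State.ACTIVE;
            swapRoles();
        }

        // Administrator-initiated: both nodes are healthy and swap roles.
        void gracefulFailover() {
            active.state = State.STANDBY;
            standby.state = State.ACTIVE;
            swapRoles();
        }

        private void swapRoles() {
            Node old = active;
            active = standby;
            standby = old;
        }
    }

    public static void main(String[] args) {
        Node nn1 = new Node("nn1", State.ACTIVE, 0);
        Node nn2 = new Node("nn2", State.STANDBY, 0);
        Controller c = new Controller(nn1, nn2, 1000);
        System.out.println("failover at t=500?  " + c.check(500));
        System.out.println("failover at t=2000? " + c.check(2000));
        System.out.println(nn1.name + " is " + nn1.state + ", " + nn2.name + " is " + nn2.state);
    }
}
```

The point of fencing before promotion is that at no moment are two nodes marked `ACTIVE`, even if the missed heartbeat came from a slow network rather than a dead namenode.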

Secondary NameNode check-pointing process

The purpose of the secondary namenode is to produce checkpoints of the primary's in-memory filesystem metadata. The checkpointing process proceeds as follows:

1. The secondary asks the primary to roll its edits file, so new edits go to a new file.
2. The secondary retrieves fsimage and edits from the primary (using HTTP GET).
3. The secondary loads fsimage into memory, applies each operation from edits, then creates a new consolidated fsimage file.
4. The secondary sends the new fsimage back to the primary (using HTTP POST).
5. The primary replaces the old fsimage with the new one from the secondary, and the old edits file with the new one it started in step 1. It also updates the fstime file to record the time that the checkpoint was taken.

At the end of the process, the primary has an up-to-date fsimage file and a shorter edits file (it is not necessarily empty, as it may have received some edits while the checkpoint was being taken). It is possible for an administrator...
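The five steps above can be mimicked in memory to show why the merge shortens the edit log. This is only a sketch: `CheckpointSketch`, `Edit`, and `Primary` are hypothetical names, the fsimage is reduced to a set of paths, the edit log to a list of create/delete operations, and the HTTP transfers are replaced by plain method calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of the checkpoint cycle; not HDFS code.
class CheckpointSketch {

    // One edit-log entry: create or delete a path.
    static class Edit {
        final boolean create;
        final String path;

        Edit(boolean create, String path) {
            this.create = create;
            this.path = path;
        }
    }

    static class Primary {
        Set<String> fsimage = new TreeSet<>();
        List<Edit> edits = new ArrayList<>();

        // Step 1: roll the edit log so new edits go to a fresh one.
        List<Edit> rollEdits() {
            List<Edit> rolled = edits;
            edits = new ArrayList<>();
            return rolled;
        }
    }

    // Step 3: load the image, replay each edit, produce a consolidated image.
    static Set<String> merge(Set<String> fsimage, List<Edit> edits) {
        Set<String> merged = new TreeSet<>(fsimage);
        for (Edit e : edits) {
            if (e.create) {
                merged.add(e.path);
            } else {
                merged.remove(e.path);
            }
        }
        return merged;
    }

    // Steps 1 to 5, with the GET/POST transfers modeled as direct calls.
    static void checkpoint(Primary primary) {
        List<Edit> rolled = primary.rollEdits();               // step 1
        Set<String> merged = merge(primary.fsimage, rolled);   // steps 2 and 3
        primary.fsimage = merged;                              // steps 4 and 5
    }

    public static void main(String[] args) {
        Primary p = new Primary();
        p.fsimage.add("/a");
        p.edits.add(new Edit(true, "/b"));
        p.edits.add(new Edit(false, "/a"));
        checkpoint(p);
        System.out.println("fsimage=" + p.fsimage + " pending edits=" + p.edits.size());
    }
}
```

Because the log is rolled before the merge starts, any edit the primary records during the checkpoint lands in the new log and is simply picked up by the next checkpoint.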