Linux internals and network programming: 2016

hvn-network

Tuesday, November 1, 2016

Simple Factory Pattern, Factory Method Pattern, Abstract Factory Pattern

Factory pattern is one of the most used pattern in Object Oriented Design. In this post, i will present basic concepts about Simple Factory Pattern, Factory Method Pattern and Abstract Factory Pattern. Then i will show the differences among them and when we use each of them.

1. Problem description

Imagine that you was assigned to develop an application to sell mobile phones. At first, our shop sell brand new Apple phones and brand new Samsung phones only. We create a Phone abstract class inherited by concrete classes. Name and pack methods inply the phone name and the phone is packed respectively. A client (main function) create phone based on user's input.

Class diagram for the example

The C++ code for the above description can be given as follows.

#include <iostream>
#include <string>
using namespace std;
class Phone {
public:
 virtual void name() = 0;
 void pack() {
  cout << "This phone is packed and sent to you soon" << endl;
 }
};

class Samsung : public Phone {
public:
 void name() {

  cout << "This is a brand new Samsung phone" << endl;
 }
};

class Apple : public Phone {
public:
 void name() {
  cout << "This is a brand new Iphone" << endl;
 }
};

/* Client code */
int main() {
 string userInput = "samsung";  //simulate user input
 Phone *phone;
 if (userInput == "samsung") {
  phone = new Samsung;
 } else {
  phone =  new Apple;
 }
 phone ->name();
 phone ->pack();
 return 0;
}

Assume that an user wants to buy a Samsung phone (we simulate this action with string userInput = "samsung";) , the output will be:

"This is a brand new Samsung phone

This phone is packed and sent to you soon".

In small and simple problem, this design work fine. But later, the shop decides to sell HTC phone. We have to modify client code (User interface). This breaks the Open Closed Principle, which says software entities (class, function, ...) should be open for extension and closed for modification).

Lesson learnt, changes are obvious part of software development, but system should be developed to adapted with changes without modifying the tested sections.

2. Simple Factory Pattern.

We encapsulate the code sections should be changed (because of new phone type addition). So this modification does not affect the tested part of our software. To do that, we create a Simple Factory class to make a new Phone. Client UI code is unchanged even if new products are added.

Class diagram of simple factory pattern in theory

Class diagram of simple factory pattern for the example

The C++ code for this design is given as follows.

#include <iostream>
#include <string>
using namespace std;
class Phone {
public:
 virtual void name() = 0;
 void pack() {
  cout << "This phone is packed and sent to you soon" << endl;
 }
};


class Samsung : public Phone {
public:
 void name() {
  cout << "This is a brand new Samsung phone" << endl;
 }
};


class Apple : public Phone {
public:
 void name() {
  cout << "This is a brand new Iphone" << endl;
 }
};

class PhoneFactory {
public:
 Phone* makePhone(string userInput) {
  Phone *phone;
  if (userInput == "samsung") {
   phone = new Samsung;
  } else {
   phone =  new Apple;
  }
  phone ->name();
  phone ->pack();
  return phone;
 }
};

int main() {
 string userInput = "samsung";        /*simulate user input*/
 PhoneFactory *phoneFactory = new PhoneFactory;
 Phone *phone = phoneFactory->makePhone(userInput);
 return 0;

}

The shop operation has increased a lot and the boss wants to sell like new smart phones. Then, the code in PhoneFactory class can be modified as follows.

class PhoneFactory {
public:
 Phone* makePhone(string userInput) {
  Phone *phone;
  if (userInput == "samsung") {
   phone = new Samsung;
  } else if(userInput == "apple") {
   phone =  new Apple;
  } else if(userInput == "like new samsung){
   phone = new LikeNewSamsung;
  } else {
   phone = new LikeNewApple;
  }
  phone ->name();
  phone ->pack();
  return phone;
 }
};

It actually works, but breaks Single Responsibility Principle saying a class should have only one reason to change. Here,. the PhoneFactory class will be changed

When others brand new phones are added
When others like new phones are added

To solve this problem, we may think about are two separate Factory classes, one for brand new phones, the other for like new phones.

class BrandNewPhoneFactory {
public:
 Phone* makePhone(string userInput) {
  Phone *phone;
  if (userInput == "samsung") {
   phone = new Samsung;
  } else {
   phone =  new Apple;
  }
  phone ->name();
  phone ->pack();
  return phone;
 }
};
class LikeNewPhoneFactory{
  Phone *phone;
  if (userInput == "samsung") {
   phone = new LikeNewSamsung;
  } else {
   phone =  new LikeNewApple;
  }
  phone ->name();
  phone ->pack();
  return phone;
 }
};

This seems to work, but the two factory classes are independent. We cannot make sure every factory class follow the same rule.

3. Factory method pattern

To solve this problem, we use the factory method pattern, which let the abstract factory class encapsulate common functionalities and the derived classes will override the abstract method.

Class diagram of factory method pattern in theory

Class diagram of factory method pattern for the example

For now, if a new phone group is add. a new factory will be derived from BasePhoneFactory, encapsulating creation of all concrete classes.

#include <iostream>
#include <string>
using namespace std;

class Phone {
public:
 virtual void name() = 0;
 void pack() {
  cout << "This phone is packed and sent to you soon" << endl;
 }
};


class Samsung : public Phone {
public:
 void name() {
  cout << "This is a brand new Samsung phone" << endl;
 }
};

class Apple : public Phone {
public:
 void name() {
  cout << "This is a brand new Iphone" << endl;
 }
};

class LikeNewSamsung: public Phone {
public:
 void name() {
  cout << "This is a like new Samsung phone"<< endl;
 }
};

class LikeNewApple: public Phone {
public:
 void name() {
  cout << "This is a like new Apple phone"<< endl;
 }
};


class PhoneFactory {
public:
 Phone *makePhone(string userInput) {
  Phone *phone = this -> getPhone(userInput);
  phone->name();
  phone->pack();
  return phone;
 }
 virtual Phone *getPhone(string userInput) = 0;
};

class BrandNewPhoneFactory : public PhoneFactory{
public:
 Phone *getPhone(string userInput) {
  if (userInput == "samsung") {
   return new Samsung;
  } else {
   return  new Apple;
  }
 }
};

class LikeNewPhoneFactory : public PhoneFactory{
public:
 Phone *getPhone(string userInput) {
  if (userInput == "samsung") {
   return new LikeNewSamsung;
  } else {
   return new LikeNewApple;
  }
 }
};


int main() {
 string userInput = "samsung";        /*simulate user input*/
 PhoneFactory *phoneFactory = new LikeNewPhoneFactory;
 Phone *phone = phoneFactory->makePhone(userInput);
 return 0;
}

4. Factory Method

Different from simple factory and factory method, which create one product at a time, factory method pattern is used when we want to create a family of components (usually to make a product by them). For example, we have to develop a software to assemble phones based on requirements of customers. For simplicity, we assume that phones are made from two main components Board and Monitor. We have three phone types:

Expensive phone: Created from expensive board and expensive monitor
Medium phone: Created from expensive board and cheap monitor
Cheap phone: Created from cheap board and cheap monitor

We define a AbstractFactory as a contract of derived class with two abstract methods: makeBoard and makeMonitor. Three derived class based on phone quality create the corresponding components.

Class diagram of Abstract Factory Pattern in theory

Class diagram of Abstract Factory Pattern for the example

The C++ code for this example is given as follows.

#include <iostream>
#include <string>
using namespace std;


class Board {
public:
 virtual void showBoard() = 0;
};

class ExpensiveBoard : public Board {
public:
 void showBoard() {
  cout << "this is an expensive board" << endl;
 }
};

class CheapBoard : public Board {
public:
 void showBoard() {
  cout << "This is a cheap board" << endl;
 }
};

class Monitor {
public:
 virtual void showMonitor() = 0;
};

class ExpensiveMonitor : public Monitor {
public:
 void showMonitor() {
  cout << "This is an expensive monitor" << endl;
 }
};

class CheapMonitor : public Monitor {
public:
 void showMonitor() {
  cout << "This is a cheap monitor" << endl;
 }
};

class AbstractFactory {
public:
 virtual Board *makeBoard() = 0;
 virtual Monitor *makeMonitor() = 0;
};

class ExpensivePhoneFactory : public AbstractFactory {
public:
 Board *makeBoard() {
  return new ExpensiveBoard;
 }
 Monitor *makeMonitor() {
  return new ExpensiveMonitor;
 }
};

class MediumPhoneFactory : public AbstractFactory {
public:
 Board *makeBoard() {
  return new CheapBoard;
 }
 Monitor *makeMonitor() {
  return new ExpensiveMonitor;
 }
};

class CheapPhoneFactory : public AbstractFactory {
public:
 Board *makeBoard() {
  return new ExpensiveBoard;
 }
 Monitor *makeMonitor() {
  return new ExpensiveMonitor;
 }
};

class PhoneAssembler {
private:
 AbstractFactory *factoryType;
public:
 typedef enum {
  EXPENSIVE_PHONE,
  MEDIUM_PHONE,
  CHEAP_PHONE
 } PHONE_TYPE;
 PhoneAssembler(PHONE_TYPE type) {
  if (type == EXPENSIVE_PHONE) {
   factoryType = new ExpensivePhoneFactory;
  } else if (type == MEDIUM_PHONE) {
   factoryType = new MediumPhoneFactory;
  } else {
   factoryType = new CheapPhoneFactory;
  }
 }
 void assemble() {
  Board *board = factoryType->makeBoard();
  Monitor *monitor = factoryType->makeMonitor();
  board->showBoard();
  monitor->showMonitor();
  cout << "The phone is assembled" << endl;
 }
};

int main() {
 PhoneAssembler::PHONE_TYPE userInput = PhoneAssembler::EXPENSIVE_PHONE;        /*simulate user input*/
 PhoneAssembler *phoneAssembler = new PhoneAssembler(userInput);
 phoneAssembler->assemble();
 return 0;
}

5. Conclusion

We have learnt Simple Factory, Factory Method and Abstract Factory patterns. When we use each of them.

Simple factory pattern: when we have only one family of objects (E.g. create one phone from brand new phone family)
Factory method pattern: When we have multiple families of objects (E.g. create one phone from brand new phone family or like new phone family)
Abstract factory pattern: When we have multiple families of object components (E.g. Create one board and one monitor from board family and monitor family).

Source code of the above examples can download in here.

6. References

http://www.codeproject.com/Articles/716413/Factory-Method-Pattern-vs-Abstract-Factory-Pattern

http://www.codeproject.com/Articles/492900/From-No-Factory-to-Factory-Method

http://rahulrajatsingh.com/2015/02/understanding-and-implementing-factory-pattern-in-c/

Monday, October 31, 2016

Program to interface not implementation

In Object-Oriented Programming (OOP), the desired software is not only runnable but also extensible, maintainable and testable. In this post, I will present a principle in OOP: "Program to interface not implementation". Before digging deeper in this principle, i will present basic concepts about concrete class, abstract class and interface.

1. Basic concepts

1.1. Concrete class

Concrete class is a class that can be used to specify any specific entity. For example, class Car:

class Car{
 string color;
 string model;
public:
 string getColor(){
  return this->color;
 }
 void setColor(string color){
  this->color = color; 
 } 
 string getModel(){
  return this->model;
 }
 void setModel(string model){
  this->model = model;
 }
}

The class Car has two attributes color and model, so any object instantiated by class Car will be determistic with these attributes.

1.2. Abstract class

Suppose that we add a new method called getTax() to class Car. But this class does not know how to implement this because tax may be depended on car type (assuming car has two types: RegularCar and SupperCar). So we mark virtual keyword for the getTax() function. The class Car now is considered to be abstract.

class Car{
 string color;
 string model;
public:
 string getColor(){
  return this->color;
 }
 void setColor(string color){
  this->color = color; 
 } 
 string getModel(){
  return this->model;
 }
 void setModel(string model){
  this->model = model;
 }
 virtual float getTax();
}

We cannot use abstract class directly and the classes inheriting from this class have to provide implementation of the virtual methods.

class SportCar : public Car{
public:
 float getTax(){
   return 2000f;
 }
}

1.3. Interface

In C++ we may consider an interface as a pure abstract class (with no implementation code). For example:

class Shape{
public:
 virtual float getArea();
 virtual float getPerimeter();
}

class Square : public Shape{
 int sideLength;
public:
 float getArea(){
  return sideLength*sideLength;
 }
 float getPerimeter(){
  return 4*sideLength;
 }
}

1.4. Why do we need Interfaces and Abstract Classes

We use interfaces and abstract class when:
- We need multiple class to behave in a polymorphic way
- We need some kind off contract to enforce on the classes.

2. Program to an interface, not an implementation

This principle means uses interfaces from other part of the application rather than the concrete implementation. When we work on a software, we often spend much time maintaining than developing. If we programming to an implementation, when the requirement change, we need much more efforts in modifying code. For example:

class Dog{
public:
 void run(){
  cout <<"A dog is running";
 }
}

Class Collection{
public: 
 void performAnimalAction(){
  Dog *dog = new Dog;
  dog -> run();
 }
}

This example, use concrete implementation (Dog), it work fine if we has only Dog in Collection. Imagine that we add a new class called Cat to the class Collection later, the performAnimalAction() function will be messed up.

To handle this problem, we create a interface Animal, and use interface instead of concrete implementation in performAnimalAction() function.

class Animal{
public:
 virtual void run();
}

class Dog : public Animal{
public:
 void run(){
  cout <<"A dog is running";
 }
}

class Cat : public Animal{
public:
 void run(){
  cout <<"A cat is running";
 }
}
class Collection{
public: 
 void performAnimalAction(Animal *_animal){
  Animal *animal = _animal;
  animal->run();
 }
}

int main(){
 Collection *collection = new Collection;
 collection->performAnimalAction(new Dog);
 collection->performAnimalAction(new Cat);
}

Reference

http://www.codeproject.com/Articles/702246/Program-to-Interface-not-Implementation-Beginners

Friday, October 28, 2016

Algorithms

This page covers topics on algorithms.

1. Data structure
Stacks
Queues
Linked lists
Hash tables
Binary Search trees

2. Algorithms
Searching and sorting algorithms.
Binary tree
Greedy algorithms
Dynamic programming
Elementary graph algorithms
Minimum Spanning Tree

Wednesday, October 26, 2016

Packet flow in routing/switching

Situation 1: CPE1 talks with CPE2

- Assume at CPE1: ping 192.168.0.4

CPE1 checks its routing table to see where to send packet to. It detects that CPE2 is on same network.

Header of sending packet from CPE1.
SIP = 192.168.0.3, DIP = 192.168.0.4
SA = m1, DA=? (unknown yet)

CPE1 broadcasts ARP to know MAC address of 192.168.0.4. After this step it knows DA as m2. Consequently, Switch1 updates its learning table
MAC (m1) ------ Port (fe1)
MAC (m2) ------ Port (fe2)

At this time, CPE1 just sends packet to CPE2 with below header.
SIP = 192.168.0.3, DIP = 192.168.0.4
SA = m1, DA=m2

Situation 2: CPE1 talks with CPE3
- Assume at CPE1: ping 192.168.11.5

CPE1 checks its routing table to see where to send packet to. It detects that CPE3 is not on same network. It has to send packet to its gateway (192.168.0.1) at interface R1(Lan1)

Header of sending packet from CPE1.
SIP = 192.168.0.3, DIP = 192.168.11.5
SA = m1, DA=? (unknown yet)

CPE1 broadcasts ARP to know MAC address of 192.168.0.1. After this, CPE1 know MAC of gateway as a1. Simultaneously, Switch1 updates its learning table.
MAC (m1) ------ Port (fe1)
MAC (m2) ------ Port (fe2)
MAC (a1) ------ Port (fe0)

CPE1 then sends packet to gateway with below header
SIP = 192.168.0.3, DIP = 192.168.11.5
SA = m1, DA=a1

Next, Router1 receives the packet, it check DIP (192.168.11.5) in its routing table to see where to forward packet. Ok, it detects that the packet should be forwarded to interface R2(Lan2, 192.168.11.1).

Router1 creates below packet.
SIP = 192.168.0.3, DIP = 192.168.11.5
SA = a2, DA=?

Now, it broadcasts ARP to know MAC address of 192.168.11.5. After this, Router1 knows MAC of 192.168.11.5 as m3. Switch2 updates learning table as
MAC (m3) ------ Port (fe1)
MAC (a2) ------ Port (fe0)

Finally, Router1 sends packet to CPE3 as below header.
SIP = 192.168.0.3, DIP = 192.168.11.5
SA = a2, DA= m3

Take note
- Switch just works at L2 (MAC address)
- Router is to transmit packets between different networks based on routing table (L3, IP address).

Routing in router:
+ Static routing (route add ...)
+ Dynamic routing (using protocol such as RIP, which broadcasts periodically its routing table to neighborhood and update the best routes).

Q&A
1. Why Home Gateway doesn't need to use RIP protocol?

Thursday, October 6, 2016

[Network programming] Review socket

TCP client/server model.

UDP client/server model

(to be updated..)

Sunday, October 2, 2016

I/O Multiplexing

Mô hình I/O multiplexing được tóm tắt như sau:

Các hàm quan trọng:

int select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct *timeout);

void FD_CLR(int fd, fd_set *fdset); //xóa bit tương ứng trong fdset

//sử dụng hàm này sau select để kiểm tra bit tương ứng trong fdset là được set hoặc không
int FD_ISSET(int fd, fd_set *fdset); 

void FD_SET(int fd, fd_set *fdset); //set bit trong *fdset ứng với fd

void FD_ZERO(fd_set *fdset); // xóa tất cả bit trong fdset

Nếu timeout là NULL, select bị block đến khi tồn tại fd có dữ liệu. Nếu timeout là 0, select trả lại lập tức sau khi kiểm tra các fd. Nếu timeout khác NULL và 0, select() returns khi một hoặc nhiều file discriptors trong các tập readset, wirteset, exceptset có dữ liệu cho I/O hoặc hết timeout.

Nếu trả lại thành công, select trả lại số fds có dữ liệu sẵn sàng.

Nếu không thành công, select trả lại -1 và errno. Một số errno như sau:
- EBADF: có một fd không hợp lệ
- EINTR: select bị ngắt bởi tín hiệu trước khi timeout
- EINVAL: khoảng timeout không hợp lệ

Implementation của fdset dựa trên 1 bit mask kiểu số nguyên, hoặc bit fields trong mảng số nguyên.

Ví dụ sau về 1 hàm bị block cho tới khi 1 trong 2 fd sẵn sàng có dữ liệu.

#include errno.h
#include sys/select.h

int whicisready(int fd1, int fd2) {
 int maxfd;
 int nfds;
 fd_set readset;
 
 if (fd1 < 0 || fd1 >= FD_SETSIZE || fd2 < 0 || fd2 >= FD_SETSIZE)
 {
  errno = EINVAL;
  return -1;
 }
 
 maxfd = (fd1 > fd2) ? fd1: fd2;
 FD_ZERO(&readset);
 FD_SET(fd1,&readset);
 FD_SET(fd2,&readset);
 ndfs = select(maxfd+1, &readset, NULL, NULL, NULL);
 if (ndfs == -1)
  return -1;
 if (FD_ISSET(fd1, &readset))
  return fd1;
 if (FD_ISSET(fd2, &readset))
  return fd2;
 
 errno = EINVAL;
 return -1;
}

Friday, September 23, 2016

[C Programming] Exercises on Types, Operators and Expressions

We cover solution of exercises on types, operators and expressions in book "The C programming language".

Exercise 2-1. Write a program to determine the ranges of char, short, int, and long variables, both signed and unsigned, by printing appropriate values from standard headers and by direct computation. Harder if you compute them: determine the ranges of the various floating-point types.
Solution

Exercise 2-2. Write a loop equivalent to the for loop above without using && or ||.
Solution

Convert s to integer

Convert c to lower case

Pseudo-random generator

Exercise 2-3. Write a function htoi(s), which converts a string of hexadecimal digits (including an optional 0x or 0X) into its equivalent integer value. The allowable digits are 0 through 9, a through f, and A through F.
Solution

Squeeze

Strcat

Exercise 2-4. Write an alternative version of squeeze(s1,s2) that deletes each character in s1 that matches any character in the string s2.
Solution

Exercise 2-5. Write the function any(s1,s2), which returns the first location in a string s1 where any character from the string s2 occurs, or -1 if s1 contains no characters from s2. (The standard library function strpbrk does the same job but returns a pointer to the location.)
Solution

Exercise 2-6. Write a function setbits(x,p,n,y) that returns x with the n bits that begin at position p set to the rightmost n bits of y, leaving the other bits unchanged.
Solution

Exercise 2-7. Write a function invert(x,p,n) that returns x with the n bits that begin at position p inverted (i.e., 1 changed into 0 and vice versa), leaving the others unchanged.
Solution

Exercise 2-8. Write a function rightrot(x,n) that returns the value of the integer x rotated to the right by n positions.
Solution

Bit count

Exercise 2-9. In a two's complement number system, x &= (x-1) deletes the rightmost 1-bit in x. Explain why. Use this observation to write a faster version of bitcount.
Solution

Exercise 2-10. Rewrite the function lower, which converts upper case letters to lower case, with a conditional expression instead of if-else.
Solution

[C Programming] Exercises on Control flow

We cover solution of exercises on Control flow in book "The C programming language".

Binary search

Exercise 3-1. Our binary search makes two tests inside the loop, when one would suffice (at the price of more tests outside.) Write a version with only one test inside the loop and measure the difference in run-time.
Solution

Count digit

Exercise 3-2. Write a function escape(s,t) that converts characters like newline and tab into visible escape sequences like \n and \t as it copies the string t to s. Use a switch. Write a function for the other direction as well, converting escape sequences into the real characters.
Solution

Convert string to integer

Shellsort

Reverse string

Exercise 3-3. Write a function expand(s1,s2) that expands shorthand notations like a-z in the string s1 into the equivalent complete list abc...xyz in s2. Allow for letters of either case and digits, and be prepared to handle cases like a-b-c and a-z0-9 and -a-z. Arrange that a leading or trailing - is taken literally.
Solution

Convert n to characters

Exercise 3-4. In a two's complement number representation, our version of itoa does not handle the largest negative number, that is, the value of n equal to -(2wordsize-1). Explain why not. Modify it to print that value correctly, regardless of the machine on which it runs.
Solution

Exercise 3-5. Write the function itob(n,s,b) that converts the integer n into a base b character representation in the string s. In particular, itob(n,s,16) formats s as a hexadecimal integer in s.
Solution

Exercise 3-6. Write a version of itoa that accepts three arguments instead of two. The third argument is a minimum field width; the converted number must be padded with blanks on the left if necessary to make it wide enough.
Solution

Strim

Goto

[C Programming] Functions and Program structure

We cover solution of exercises on Functions and Program structure in book "The C program language".

Power

Longest line

Exercise 1.15. Rewrite the temperature conversion program of Section 1.2 to use a function for conversion.
Solution

Get line

Exercise 1-20. Write a program detab that replaces tabs in the input with the proper number of blanks to space to the next tab stop. Assume a fixed set of tab stops, say every n columns. Should n be a variable or a symbolic parameter?
Solution

Exercise 1-21. Write a program entab that replaces strings of blanks by the minimum number of tabs and blanks to achieve the same spacing. Use the same tab stops as for detab. When either a tab or a single blank would suffice to reach a tab stop, which should be given preference?
Solution

Exercise 1-22. Write a program to ``fold'' long input lines into two or more shorter lines after the last non-blank character that occurs before the n-th column of input. Make sure your program does something intelligent with very long lines, and if there are no blanks or tabs before the specified column.
Solution

Exercise 1-23. Write a program to remove all comments from a C program. Don't forget to handle quoted strings and character constants properly. C comments don't nest.
Solution

Exercise 1-24. Write a program to check a C program for rudimentary syntax errors like unmatched parentheses, brackets and braces. Don't forget about quotes, both single and double, escape sequences, and comments. (This program is hard if you do it in full generality.)
Solution

Exercise 4-1. Write the function strindex(s,t) which returns the position of the rightmost occurrence of t in s, or -1 if there is none.
Solution

Convert string to double

Exercise 4-2. Extend atof to handle scientific notation of the form
123.45e-6
where a floating-point number may be followed by e or E and an optionally signed exponent.
Solution

Reverse Polish calculator

Exercise 4-3. Given the basic framework, it's straightforward to extend the calculator. Add the modulus (%) operator and provisions for negative numbers.

Exercise 4-4. Add the commands to print the top elements of the stack without popping, to duplicate it, and to swap the top two elements. Add a command to clear the stack.

Exercise 4-5. Add access to library functions like sin, exp, and pow. See <math.h> in Appendix B, Section 4.

Exercise 4-6. Add commands for handling variables. (It's easy to provide twenty-six variables with single-letter names.) Add a variable for the most recently printed value.

Exercise 4-7. Write a routine ungets(s) that will push back an entire string onto the input. Should ungets know about buf and bufp, or should it just use ungetch?

Exercise 4-8. Suppose that there will never be more than one character of pushback. Modify getch and ungetch accordingly.

Exercise 4-9. Our getch and ungetch do not handle a pushed-back EOF correctly. Decide what their properties ought to be if an EOF is pushed back, then implement your design.

Exercise 4-10. An alternate organization uses getline to read an entire input line; this makes getch and ungetch unnecessary. Revise the calculator to use this approach.

Exercise 4-11. Modify getop so that it doesn't need to use ungetch. Hint: use an internal static variable.

Quick sort

Exercise 4-12. Adapt the ideas of printd to write a recursive version of itoa; that is, convert an integer into a string by calling a recursive routine.

Exercise 4-13. Write a recursive version of the function reverse(s), which reverses the string s in place.

Exercise 4-14. Define a macro swap(t,x,y) that interchanges two arguments of type t. (Block structure will help.)

Thursday, September 22, 2016

[C Programming] Exercises on Structures

We cover exercises on Structures in book "The C Programming language".

Count word version 1

Count word version 2

Count word

Exercise 6-2. Write a program that reads a C program and prints in alphabetical order each group of variable names that are identical in the first 6 characters, but different somewhere thereafter. Don't count words within strings and comments. Make 6 a parameter that can be set from the command line.
Solution

Exercise 6-3. Write a cross-referencer that prints a list of all words in a document, and for each word, a list of the line numbers on which it occurs. Remove noise words like ``the,'' ``and,'' and so on.
Solution

Exercise 6-4. Write a program that prints the distinct words in its input sorted into decreasing order of frequency of occurrence. Precede each word by its count.
Solution

Exercise 6-5. Write a function undef that will remove a name and definition from the table maintained by lookup and install.
Solution

Exercise 6-6. Implement a simple version of the #define processor (i.e., no arguments) suitable for use with C programs, based on the routines of this section. You may also find getch and ungetch helpful.
Solution

Wednesday, September 21, 2016

[C Programming] Exercises on Pointer and Array

We cover exercises on Pointer and Array in book "The C programming language".

Longest line

Exercise 1-16. Revise the main routine of the longest-line program so it will correctly print the length of arbitrary long input lines, and as much as possible of the text.
Solution

Exercise 1-17. Write a program to print all input lines that are longer than 80 characters.
Solution

Exercise 1-18. Write a program to remove trailing blanks and tabs from each line of input, and to delete entirely blank lines.
Solution

Exercise 1-19. Write a function reverse(s) that reverses the character string s. Use it to write a program that reverses its input a line at a time.
Solution

Count digits

Exercise 1-13. Write a program to print a histogram of the lengths of words in its input. It is easy to draw the histogram with the bars horizontal; a vertical orientation is more challenging.
Solution

Exercise 1-14. Write a program to print a histogram of the frequencies of different characters in its input.
Solution

Get Int

Exercise 5-1. As written, getint treats a + or - not followed by a digit as a valid representation of zero. Fix it to push such a character back on the input.
Solution

Exercise 5-2. Write getfloat, the floating-point analog of getint. What type does getfloat return as its function value?
Solution

Strlen

Strcpy

Strcmp

Exercise 5-3. Write a pointer version of the function strcat that we showed in Chapter 2: strcat(s,t) copies the string t to the end of s.
Solution

Exercise 5-4. Write the function strend(s,t), which returns 1 if the string t occurs at the end of the string s, and zero otherwise.
Solution

Exercise 5-5. Write versions of the library functions strncpy, strncat, and strncmp, which operate on at most the first n characters of their argument strings. For example, strncpy(s,t,n) copies at most n characters of t to s. Full descriptions are in Appendix B.
Solution

Exercise 5-6. Rewrite appropriate programs from earlier chapters and exercises with pointers instead of array indexing. Good possibilities include getline (Chapters 1 and 4), atoi, itoa, and their variants (Chapters 2, 3, and 4), reverse (Chapter 3), and strindex and getop (Chapter 4).
Solution

Quicksoft

Exercise 5-7. Rewrite readlines to store lines in an array supplied by main, rather than calling alloc to maintain storage. How much faster is the program?
Solution

Day of year

Exercise 5-8. There is no error checking in day_of_year or month_day. Remedy this defect.
Solution

Month name

Exercise 5-9. Rewrite the routines day_of_year and month_day with pointers instead of indexing.
Solution

Pattern-finding

Exercise 5-10. Write the program expr, which evaluates a reverse Polish expression from the command line, where each operator or operand is a separate argument. For example,
expr 2 3 4 + *
evaluates 2 * (3+4).
Solution

Exercise 5-11. Modify the program entab and detab (written as exercises in Chapter 1) to accept a list of tab stops as arguments. Use the default tab settings if there are no arguments.
Solution

Exercise 5-12. Extend entab and detab to accept the shorthand
entab -m +n
to mean tab stops every n columns, starting at column m. Choose convenient (for the user) default behavior.
Solution

Exercise 5-13. Write the program tail, which prints the last n lines of its input. By default, n is set to 10, let us say, but it can be changed by an optional argument so that
tail -n
prints the last n lines. The program should behave rationally no matter how unreasonable the input or the value of n. Write the program so it makes the best use of available storage; lines should be stored as in the sorting program of Section 5.6, not in a two-dimensional array of fixed size.
Solution

Qsort

Exercise 5-14. Modify the sort program to handle a -r flag, which indicates sorting in reverse (decreasing) order. Be sure that -r works with -n.
Solution

Exercise 5-15. Add the option -f to fold upper and lower case together, so that case distinctions are not made during sorting; for example, a and A compare equal.
Solution

Exercise 5-16. Add the -d (``directory order'') option, which makes comparisons only on letters, numbers and blanks. Make sure it works in conjunction with -f.
Solution

Exercise 5-17. Add a field-searching capability, so sorting may bee done on fields within lines, each field sorted according to an independent set of options. (The index for this book was sorted with -df for the index category and -n for the page numbers.)
Solution

Parse Declarator

Exercise 5-18. Make dcl recover from input errors.
Solution

Exercise 5-19. Modify undcl so that it does not add redundant parentheses to declarations.
Solution

Exercise 5-20. Expand dcl to handle declarations with function argument types, qualifiers like const, and so on.
Solution

Tuesday, September 20, 2016

[C programming] Exercises on Input, output

We cover exercises on Input, Output from book "The C programming language".

File copy

Count characters in input

Count line

Exercise 1-8. Write a program to count blanks, tabs, and newlines.
Solution

Exercise 1-9. Write a program to copy its input to its output, replacing each string of one or more blanks by a single blank.
Solution

Exercise 1-10. Write a program to copy its input to its output, replacing each tab by \t, each backspace by \b, and each backslash by \\. This makes tabs and backspaces visible in an unambiguous way.
Solution

Count word

Convert to lower case

Exercise 7-1. Write a program that converts upper case to lower or lower case to upper, depending on the name it is invoked with, as found in argv[0].
Solution

Exercise 7-2. Write a program that will print arbitrary input in a sensible way. As a minimum, it should print non-graphic characters in octal or hexadecimal according to local custom, and break long text lines.
Solution

Exercise 7-3. Revise minprintf to handle more of the other facilities of printf.
Solution

Exercise 7-4. Write a private version of scanf analogous to minprintf from the previous section.
Solution

Exercise 7-5. Rewrite the postfix calculator of Chapter 4 to use scanf and/or sscanf to do the input and number conversion.
Solution

Copy File

Fgets Fputs Getline

Exercise 7-6. Write a program to compare two files, printing the first line where they differ.
Solution

Exercise 7-7. Modify the pattern finding program of Chapter 5 to take its input from a set of named files or, if no files are named as arguments, from the standard input. Should the file name be printed when a matching line is found?
Solution

Exercise 7-8. Write a program to print a set of files, starting each new one on a new page, with a title and a running page count for each file.
Solution

Exercise 7-9. Functions like isupper can be implemented to save space or to save time. Explore both possibilities.
Solution

Greedy Algorithms

Definition

An algorithm is said to be greedy if it makes the locally optimal choice at each step with the hope of eventually reaching the globally optimal solution.

In some cases, greedy works, the solution is short and runs efficiently. For many others, however, it does not.

A problem must exhibit these two properties in order for a greedy algorithm to work:

It has optimal sub-structures. Optimal solution to the problem contains optimal solutions to the sub-problems.
It has the greedy property. If we make a choice that seems like the best at the moment and proceed to solve the remaining subproblem, we reach the optimal solution. We will never have to reconsider our previous choices.

Elements of the greedy strategy

To develop a greedy algorithm, we go through the following steps:

Cast the optimization problem as one in which we make a choice and are left with one subproblem to solve.
Prove that there is always an optimal solution to the original problem that makes the greedy choice, so that the greedy choice is always safe.
Demonstrate optimal substructure by showing that, having made the greedy choice, what remains is a subproblem with the property that if we combine an optimal solution to the subproblem with the greedy choice we have made, we arrive an optimal solution to the original problem.

Greedy versus dynamic programming

Both greedy and dynamic programming strategies exploit optimal substructure. In greedy algorithm, we can assemble a globally optimal solution by making locally optimal choices. In more detail, we make the choice that looks best in the current problem, without considering results from subproblems. In dynamic programming, we make a choice at each step, but the choice usually depends on the solutions to the problem.

Coin change problem

Problem discription: Given a target amount V cents and a list of denominations of n coins, i.e. we have coin_value[i] (in cents) for coin types i = 1..n-1, what is the minimum number of coins that we must use to represent amount V? Assume that we have an unlimited supply of coins of any type.
Example: if n = 4, coin_value = {25, 10, 5, 1} cents, and we want to represent V = 42 cents, we can use this Greedy algorithm: Select the largest coint denomination wich is not greater than the remaining amount, i.e. 42 - 35 = 17 , 17 - 10 = 7 , 7 - 5 = 2 , 2 - 1 = 1, 1 - 1 = 0., a total of 5 coins. This is optimal.

The problem above has the two ingredients required for a successful greedy algorithm.

It has optimal substructures: We have seen that in our quest to represent 42 cents, we used 25 + 10 + 5 + 1 + 1. This is an optimal 5-coin solution to the original problem. i.e. to present 17 cents we can use 10 + 5 + 1 + 1 (part of the solution for 42 cents), to represent 7 cents, we can use 5 + 1 + 1 (also part of te solution for 42 cents).
It has the greedy property: Given every amount V, we can greedily subtract from it the largest coin denomination which is not greater than this amount V. It can be proven that using any other strategies will not lead to an optimal solution, at least for this set of coin denominations.

The following is the C code for the example.

#include <stdio.h>
#define COIN_TYPE 4

int main() {
 int coin_value[COIN_TYPE] = {25, 10, 5, 1};
 int total_value,i;
 printf("%s\n", "Enter total value: ");
 scanf("%d", &total_value);
 printf("%s", "Selected coins are: ");
 while ( total_value > 0) {
  for (i = 0; i < COIN_TYPE; i++) {
   if( total_value >= coin_value[i]){
    total_value -= coin_value[i];
    printf("%d\n", coin_value[i]);
    break;
   }
  }
 }
 return 0;
}

However, this greedy algorithm does not work for all sets of coin denominations. Take for example {4, 3, 1} cents. To make 6 cents with that set, a greedy algorithm would choose 3 coins {4, 1, 1} instead of the optimal solution that uses 2 coins {3, 3}.

Station balance problem

Given 1 <= C <= 5 chambers which can store 0, 1, or 2 specimens, 1 <= S <= 2c specimens and a list M of the masses of the S specimens, determine which chamber should store each specimen in order to minimize imbalance.

Denote A the average of the total mass in each of the C chambers. The imbalance is the sum of differences between the actual total mass in each chamber and the average total mass.

If S > C, then S - C specimans must be paired with a chamber already containing other specimens. So we create dummy specimens with mass 0. And use greedy algorithm to out specimens into each chambers.

C code for this example:

#include <stdio.h>
#include <math.h>
#define NUM_SPE 5   //number of specimens
#define NUM_CHB 3   //number of chambers
int specimens[NUM_SPE] = {5, 1, 2, 7, 6};
int optimal[2 * NUM_CHB];

void sort(int array[]) {
 int i, j;
 for (i = 0; i < NUM_SPE; i++) {
  for (j = i + 1; j < NUM_SPE; j++) {
   if (array[i] < array[j]) {
    int temp = array[i];
    array[i] = array[j];
    array[j] = temp;
   }
  }
 }
}
int main() {
 int i;
 sort(specimens);
 for ( i = 0; i < NUM_CHB; i++) {
  printf("Chamber %d has %d ", i, specimens[i]);
  if (2 * NUM_CHB - i <= NUM_SPE) {
   printf("and %d\n", specimens[2 * NUM_CHB - i - 1]);
  } else {
   printf("\n");
  }
 }
 return 0;
}

Huffman code

Huffman invented a greedy algorithm that constructs an optimal prefix code called a Huffman code. Huffman codes compress data very effectively: savings of 20% to 90% are typical. Suppose that we have a 100,000 character data that we wish to store compactly, we observe that the character in the file occur with the frequencies given by the following figure.

If we ned 3 bits to represent 6 characters: a = 000, b = 001, ..., f = 101. This method requires 300,000 bits to code the entire file.

The idea of Huffman code is giving frequent characters short codewords and infrequent characters long codewords. As in the above figure, the total bits are only 224,000. To enable to decompress Huffman code, each codeword must not be the prefix of the others. To generate code word we can use a binary tree as follow with assumption that the left (right) child node is corresponding to bit 0 (bit 1).

References

Steven Halim, Felix Halim, "Competitive programming", handbook for ACM ICPC and IOI contestants 2013

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, "Introduction to Algorithm", Third edition.

[C programming] UNIX system interface

We cover below parts.

- File descriptors
- Low level I/O
- Open, create, close, unlink
- Random acess - lseek

Monday, September 19, 2016

[C programming] Functions and program structure

We cover below parts.

- External variable
- Static variables
- Register variable
- The C preprocessor

Other parts(basics of fuctions, header rules, block structure, initialization and recursion) are skipped.

1. External variable
A variable is external if it is defined outside of any function. When we need to inform other source files about functions/global variables in a file. For functions, put function prototypes in header file. For variables, re-declare the global variable using the extern keyword in header file. extern informs compiler that variable defined somewhere else. extern helps to access/modify of global variable from other source files.

2. Variable scope (static variables and register variables)
scope - defines the region in which a variable is valid.

static keyword has two meanings, depending on where the static variable is declared. Outside a function, static variables/functions only visible within that file, not globally (cannot be extern’ed). Inside a function, static variables have following characteristics. First, they are still local to that function. Second, they are initialized only during program initialization. Third, they do not get reinitialized with each function call.

During execution, data processed in registers. Explicitly storing commonly used data in registers helps to minimize load/store overhead. We can explicitly declare certain variables as registers using register keyword. Registers do not reside in addressed memory; pointer of a register variable illegal.

Check more detailed about rule scope in below links.
https://www.tutorialspoint.com/cprogramming/c_scope_rules.htm
http://www.geeksforgeeks.org/scope-rules-in-c/

Static variable
http://quiz.geeksforgeeks.org/static-variables-in-c/

Register variable
http://www.geeksforgeeks.org/understanding-register-keyword/

3. The C Preprocessor
Check below link for a clearly overview on CPP.
https://gcc.gnu.org/onlinedocs/cpp/

[C programming] control flow

We cover below parts.
- Statements and blocks
- If-else
- Else-if
- Switch
- Loops -While and For
- Loops - Do While
- Break and Continue
- Goto and labels

1. Statements and blocks
An expression such as x = 0 or i++ or printf(...) becomes a statement when it is followed by a semicolon, as in

x = 0;

Braces { and } are used to group declarations and statements together into a compound statement, or block, so that they are syntactically equivalent to a single statement.

2. If-else

if (expression)
statement1
else
statement2

3. Else-If

if (expression)
statement
else if (expression)
statement
else if (expression)
statement
else if (expression)
statement
else
statement

4. Switch

switch (expression) {
case const-expr: statements
case const-expr: statements
default: statements

}
Below example to count digits, white space, others.

main() /* count digits, white space, others */
{
 int c, i, nwhite, nother, ndigit[10];
 
 nwhite = nother = 0;
 
 for (i = 0; i < 10; i++)
 ndigit[i] = 0;
 
 while ((c = getchar()) != EOF) {
  switch (c) {
  case '0': case '1': case '2': case '3': case '4':
  case '5': case '6': case '7': case '8': case '9':
  ndigit[c-'0']++;
  break;
  case ' ':
  case '\n':
  case '\t':
  nwhite++;
  break;
  default:
  nother++;
  break;
  }
 }
 
 printf("digits =");
 
 for (i = 0; i < 10; i++)
 printf(" %d", ndigit[i]);

 printf(", white space = %d, other = %d\n", nwhite, nother);
 
 return 0;
}

The break statement causes an immediate exit from the switch.

5. Loops - While and For
We have already encountered the while and for loops. In

while (expression)
statement

the expression is evaluated. If it is non-zero, statement is executed and expression is re-evaluated. This cycle continues until expression becomes zero, at which point execution resumes after statement.

The for statement

for (expr1; expr2; expr3)
statement

is equivalent to
expr1;
while (expr2) {
statement
expr3;
}

for (;;) {
...
}
is an ``infinite'' loop,

6. Loops - Do-While
do
statement
while (expression);

7. Break and Continue

The break statement provides an early exit from for, while, and do, just as from switch.

The continue statement is related to break, but less often used; it causes the next iteration of the enclosing for, while, or do loop to begin. In the while and do, this means that the test part is executed immediately; in the for, control passes to the increment step. The continue statement applies only to loops, not to switch.

8. Goto and labels
C provides the infinitely-abusable goto statement, and labels to branch to.

for ( ... )
for ( ... ) {
...
if (disaster)
goto error;
}

...
error:
/* clean up the mess */

[C programming] Types, operators and expressions

We cover below parts.

- Variable names
- Data types and sizes
- Constants
- Declarations
- Arithmetic operators
- Relational and logical operators
- Type conversions
- Increment and decrement operators
- Bitwise operators
- Assignment operators and expressions
- Conditional expressions

1. Variable names
There are some restrictions on the names of variables and symbolic constants. Names are made up of letters and digits; the first character must be a letter. The underscore ``_'' counts as a letter;

Don't begin variable names with underscore, however, since library routines often use such names. Upper and lower case letters are distinct, so x and X are two different names. Traditional C practice is to use lower case for variable names, and all upper case for symbolic constants.

Keywords like if, else, int, float, etc., are reserved: you can't use them as variable names.

2. Data types and sizes
There are only a few basic data types in C:
char - a single byte, capable of holding one character in the local character set
int - an integer, typically reflecting the natural size of integers on the host machine
float - single-precision floating point
double - double-precision floating point

In addition, there are a number of qualifiers that can be applied to these basic types. short and long apply to integers:
short int sh;

long int counter;

The intent is that short and long should provide different lengths of integers where practical; int will normally be the natural size for a particular machine. short is often 16 bits long, and int either 16 or 32 bits. Each compiler is free to choose appropriate sizes for its own hardware.

The qualifier signed or unsigned may be applied to char or any integer. unsigned numbers are always positive or zero, and obey the laws of arithmetic modulo 2n, where n is the number of bits in the type. So, for instance, if chars are 8 bits, unsigned char variables have values between 0 and 255, while signed chars have values between -128 and 127 (in a two's complement machine.) Whether plain chars are signed or unsigned is machine-dependent, but printable characters are always positive.

The standard headers <limits.h> and <float.h> contain symbolic constants for all of these sizes.

3. Constants
An integer constant like 1234 is an int. A long constant is written with a terminal l (ell) or L, as in 123456789L; Unsigned constants are written with a terminal u or U, and the suffix ul or UL indicates unsigned long.

Floating-point constants contain a decimal point (123.4) or an exponent (1e-2) or both; their type is double, unless suffixed. The suffixes f or F indicate a float constant;

The value of an integer can be specified in octal or hexadecimal instead of decimal. A leading 0 (zero) on an integer constant means octal; a leading 0x or 0X means hexadecimal.

A character constant is an integer, written as one character within single quotes, such as 'x'.

The character constant '\0' represents the character with value zero, the null character.

A constant expression is an expression that involves only constants. Such expressions may be evaluated at during compilation rather than run-time
i.e
#define MAXLINE 1000

char line[MAXLINE+1];

A string constant, or string literal, is a sequence of zero or more characters surrounded by double quotes, as in

"I am a string". String constants can be concatenated at compile time:
"hello, " "world"
is equivalent to

"hello, world"
A string constant is an array of characters. The internal representation of a string has a null character '\0' at the end.
The standard library function strlen(s) returns the length of its character string argument s, excluding the terminal '\0'.

Be careful to distinguish between a character constant and a string that contains a single character: 'x' is not the same as "x". The former is an integer, used to produce the numeric value of the letter x in the machine's character set. The latter is an array of characters that contains one character (the letter x) and a '\0'.

There is one other kind of constant, the enumeration constant. An enumeration is a list of constant integer values, as in

enum boolean { NO, YES };
Enumerations provide a convenient way to associate constant values with names, an alternative to #define.

4. Bitwise operators
C provides six operators for bit manipulation; these may only be applied to integral operands, that is, char, short, int, and long, whether signed or unsigned.
& - bitwise AND
| - bitwise inclusive OR
^ - bitwise exclusive OR
<< - left shift
>> - right shift
~ - one's complement (unary)

AND operator & is often used to mask off some set of bits.
OR operator | is used to turn bits on.

exclusive OR operator ^ sets a one in each bit position where its operands have different bits, and zero where they are the same.

We must distinguish the bitwise operators & and | from the logical operators && and ||. For example, if x is 1 and y is 2, then x & y is zero while x && y is one.

The shift operators << and >> perform left and right shifts of their left operand by the number of bit positions given by the right operand, which must be non-negative.
i.e x << 2 (equivalent to x^2).

The unary operator ~ yields the one's complement of an integer; that is, it converts each 1-bit into a 0-bit and vice versa.

5. Assignment Operators and Expressions
An expression such as

i = i + 2
can be compressed as
i += 2
The operator += is called an assignment operator.

Most binary operators have a corresponding assignment operator op=, where op is one of
+ - * / % << >> & ^ |
If expr1 and expr2 are expressions, then
expr1 op= expr2
is equivalent to

expr1 = (expr1) op (expr2)

i.e. x *= y + 1
means
x = x * (y + 1)
rather than
x = x * y + 1

6. Conditional Expressions
The conditional expression
expr1 ? expr2 : expr3
the expression expr1 is evaluated first. If it is non-zero (true), then the expression expr2 is evaluated, and that is the value of the conditional expression. Otherwise expr3 is evaluated, and that is the value.

7. Precedence and Order of Evaluation
Below table summarizes the rules for precedence and associativity of all operators, including those that we have not yet discussed. Operators on the same line have the same precedence; rows are in order of decreasing precedence, so, for example, *, /, and % all have the same precedence, which is higher than that of binary + and -.

My Blog List