Storage system based on virtualization and distributed technology

This storage system builds an open-source Hadoop storage foundation on the FreeBSD operating system and runs on a server virtualization (VMware) platform, which gives it fast, stable, and secure hardware support. iSCSI is used to keep storage deployment costs low. The VMware virtualization platform pools the servers' hardware storage resources. The server disk array is divided into multiple logical disks by creating LUNs, and FreeBSD is installed on each LUN and configured as the iSCSI server side (target) so that the storage hardware can be used flexibly by Hadoop. Hadoop is deployed on this virtualized hardware platform to form a distributed file system and communicates with the client server through the WebDAV protocol. Users access the client server, which transfers their files to the Hadoop storage cluster using WebDAV over HTTPS.
The platform design takes full advantage of virtualization and distributed technology and uses multi-level, modular components, so the whole storage system is flexible and easy to expand, from the hardware architecture up to the software applications. Because virtualization and distributed technology have inherent security characteristics, the system also has a built-in advantage in data security, which makes low-cost deployment of data storage services possible.

1 System Design Principle
The storage system uses cloud storage technology at the bottom layer and iSCSI technology at the application layer to give users cross-system application platform support. Its working principle is shown in Figure 1.
First, multiple data storage servers are connected through the iSCSI network to form a large data storage service cluster. All data servers have the same configuration. When the storage pool becomes full, servers of the same configuration can be added to the storage network to expand capacity without changing the running state of the existing system.
The system uses VMware ESXi Server as the underlying system of the application server cluster. Each application server system is logically associated with the others on top of the VMware virtual layer. VMware allows multiple operating systems to run in parallel on one high-performance server, and allows multiple high-performance servers to run the same task. The operating systems are backed up and managed over the network, and they can be migrated and copied according to how heavily the application services are used, which expands the processing bandwidth of network applications.
FreeBSD is installed on the VMware layer as the platform for a Hadoop distributed storage system. Hadoop splits data into many small blocks, replicates them, and stores them on different data storage servers as directed by the NameNode. A Hadoop system has one Master, which mainly acts as the NameNode and the JobTracker. The JobTracker's main responsibility is to start, track, and schedule the tasks executed on the Slaves. There are multiple Slaves, and each Slave usually acts as a DataNode and also runs a TaskTracker. The TaskTracker performs Map tasks and Reduce tasks on local data according to application requirements.
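The block splitting and replication described above can be illustrated with a minimal Python sketch. It is not Hadoop's actual placement logic: the round-robin policy, the function names, and the tiny block size are assumptions chosen for readability (Hadoop 0.20 defaults to 64 MB blocks and a replication factor of 3).

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a byte string into fixed-size blocks; the last block may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, datanodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct DataNodes, round-robin.

    Hypothetical policy for illustration only; the real NameNode also
    considers rack topology and free space.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


# Toy example: a 10-byte "file", 4-byte blocks, two copies of each block.
blocks = split_into_blocks(b"abcdefghij", block_size=4)
placement = place_replicas(len(blocks), ["vc1", "vc2", "vc4"], replication=2)
for block_id, nodes in placement.items():
    print(block_id, blocks[block_id], nodes)
```

The NameNode keeps only the block-to-node mapping (the `placement` dictionary in this sketch), while the block contents live on the DataNodes.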
The WebDAV application is deployed on the NameNode so that the application server can communicate with the storage resources and users can access the data stored in Hadoop. WebDAV (Web-based Distributed Authoring and Versioning) is a communication protocol based on HTTP/1.1. It extends HTTP/1.1 by adding new methods alongside standard HTTP methods such as GET, POST, and HEAD, so that applications can write files directly to the web server, replacing the traditional FTP file transfer mode.
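As a rough illustration of how a client could address such a server, the sketch below builds (but does not send) an upload request and a directory-creation request with Python's standard library. The base URL `https://vc3/webdav` and the helper names are assumptions, not part of the original system. Note that PUT is a standard HTTP/1.1 method that WebDAV servers accept for uploads, while MKCOL is one of the methods WebDAV adds.

```python
import urllib.request

# Methods that WebDAV (RFC 4918) adds on top of HTTP/1.1.
WEBDAV_METHODS = ("PROPFIND", "PROPPATCH", "MKCOL", "COPY", "MOVE", "LOCK", "UNLOCK")


def join_url(base_url: str, remote_path: str) -> str:
    """Join a base URL and a remote path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + remote_path.lstrip("/")


def build_upload_request(base_url: str, remote_path: str, payload: bytes) -> urllib.request.Request:
    """Prepare a PUT request that would write `payload` to `remote_path`."""
    return urllib.request.Request(
        join_url(base_url, remote_path),
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def build_mkcol_request(base_url: str, remote_dir: str) -> urllib.request.Request:
    """Prepare a MKCOL request that would create a collection (directory)."""
    return urllib.request.Request(join_url(base_url, remote_dir), method="MKCOL")


# Hypothetical endpoint on the NameNode; nothing is sent over the network here.
BASE_URL = "https://vc3/webdav"
upload = build_upload_request(BASE_URL, "/reports/a.txt", b"hello")
mkcol = build_mkcol_request(BASE_URL, "reports")
print(upload.get_method(), upload.full_url)
print(mkcol.get_method(), mkcol.full_url)
```

Passing either request to `urllib.request.urlopen` would actually send it; authentication and TLS settings are left out of this sketch.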

2 Implementation of Key System Technologies
With WebDAV deployed on Hadoop, the client (application server) can copy and move files on the server side (Hadoop node servers), and multiple users can read the same file at the same time.
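Server-side copy and move operations in WebDAV are expressed with the COPY and MOVE methods plus a `Destination` header, as defined in RFC 4918. The following Python sketch only constructs such a request; the URLs and the helper name `build_transfer_request` are hypothetical.

```python
import urllib.request


def build_transfer_request(method: str, source_url: str, destination_url: str,
                           overwrite: bool = False) -> urllib.request.Request:
    """Prepare a WebDAV COPY or MOVE request without sending it.

    The Destination header names the target URL; Overwrite is "T" or "F"
    and tells the server whether an existing target may be replaced.
    """
    if method not in ("COPY", "MOVE"):
        raise ValueError("method must be COPY or MOVE")
    headers = {
        "Destination": destination_url,
        "Overwrite": "T" if overwrite else "F",
    }
    return urllib.request.Request(source_url, method=method, headers=headers)


# Hypothetical paths on the WebDAV endpoint; no network traffic is generated.
move = build_transfer_request(
    "MOVE",
    "https://vc3/webdav/a.txt",
    "https://vc3/webdav/archive/a.txt",
)
print(move.get_method(), move.full_url, "->", move.get_header("Destination"))
```

Because the server performs the copy or move itself, the file contents do not travel back through the client.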
Implementation steps (taking four servers as an example, combined with DNS servers in the local area network):
Step 1: Build the Hadoop environment. The machine names and IP addresses are vc1 (192.168.1.1), vc2 (192.168.1.2), vc3 (192.168.1.3), and vc4 (192.168.1.4). Of the four machines, vc3 is the Hadoop NameNode and the other three are DataNodes.
The detailed environment configuration is as follows:
The Hadoop version is 0.20.2;
The JDK version is 1.6.0;
The operating system is FreeBSD 8.0 (minimal installation).
vc3 (192.168.1.3) is the NameNode (Master), and the other three are DataNodes (Slaves).
Hadoop is a cluster program written in Java. Its installation depends on ssh and the JDK, so both must be installed and configured on the system before Hadoop is configured.
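For orientation, the role assignment above maps onto three Hadoop 0.20.x configuration files: `conf/core-site.xml` (property `fs.default.name`), `conf/mapred-site.xml` (property `mapred.job.tracker`), and `conf/slaves`. The Python sketch below generates their contents. The port numbers 9000 and 9001 are conventional values assumed here, not taken from the original setup, and the helper `hadoop_site_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def hadoop_site_xml(properties: dict[str, str]) -> str:
    """Render a Hadoop *-site.xml document from name/value pairs."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


NAMENODE = "vc3"                   # Master: NameNode and JobTracker
DATANODES = ["vc1", "vc2", "vc4"]  # Slaves: DataNode and TaskTracker

# Ports 9000 and 9001 are assumptions; use whatever the cluster actually uses.
core_site = hadoop_site_xml({"fs.default.name": f"hdfs://{NAMENODE}:9000"})
mapred_site = hadoop_site_xml({"mapred.job.tracker": f"{NAMENODE}:9001"})
slaves_file = "\n".join(DATANODES) + "\n"  # one worker host name per line

print(core_site)
print(mapred_site)
print(slaves_file, end="")
```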
(1) Use ssh to give users passwordless access between Hadoop nodes
① Add each node's IP address and corresponding machine name to the /etc/hosts file of every node, and create an account with the same user name and password on each node.
Modify the /etc/hosts file as follows:
192.168.1.1 vc1
192.168.1.2 vc2
192.168.1.3 vc3
192.168.1.4 vc4
Once the file has been modified, each machine name resolves to its IP address.
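A quick consistency check of the hosts entries can be sketched in Python. The parser below handles only the simple `IP name [alias ...]` layout with optional `#` comments, and the expected addresses are the ones listed for vc1 to vc4 in the first step; the function names are illustrative.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name in hosts-file text to the first IP address listed for it."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)
    return mapping


def find_mismatches(mapping: dict[str, str], expected: dict[str, str]) -> dict[str, tuple]:
    """Return {name: (found_ip, expected_ip)} for every entry that is wrong or missing."""
    return {
        name: (mapping.get(name), ip)
        for name, ip in expected.items()
        if mapping.get(name) != ip
    }


HOSTS_TEXT = """\
192.168.1.1 vc1
192.168.1.2 vc2
192.168.1.3 vc3
192.168.1.4 vc4
"""
EXPECTED = {f"vc{i}": f"192.168.1.{i}" for i in range(1, 5)}

hosts = parse_hosts(HOSTS_TEXT)
print(find_mismatches(hosts, EXPECTED) or "all entries match")
```

On a real node, the text could be read from /etc/hosts instead of the inline string, which makes a mistyped address on any one machine easy to spot.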
Create a user named Hadoop with a password of 123456 on each node.
