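Trying out Mahout
Free stuff is a curse. dataguru gave me a free study card, and to spend it before it expired I signed up for a Mahout course, only to find it is on track to drive me crazy~
1. First I have to get a Hadoop environment working. That was Hao-ge's task last year, not mine~ No way around it, time to catch up~
2. I need a theoretical grounding in data mining
3. I need some math background too. Didn't I only just start on the statistics basics for big data?
So, dutifully, I set up a pseudo-distributed Hadoop on Ubuntu 12.04, based on 2.2.0.
There are plenty of installation guides, so I won't record much here.
1. Set up Java
2. Configure ssh so that it logs in without a password
3. Four .xml files: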
-core-site.xml
-hdfs-site.xml
-mapred-site.xml
-yarn-site.xml
4. Set etc/hadoop/slaves to localhost
5. Start up
-Initialize: hdfs namenode -format
-Start: cd /opt/hadoop-2.2.0/sbin/
./hadoop-daemon.sh start datanode
./hadoop-daemon.sh start namenode
./hadoop-daemon.sh start secondarynamenode
./yarn-daemon.sh start resourcemanager
./yarn-daemon.sh start nodemanager
6. Open the management web interface and confirm everything is up
7. Then, when I tried to run something, I got this:
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /opt/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
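As an aside, steps 3 and 4 above can be sketched as a small script. This is only a minimal single-node sketch, not the configuration actually used here: the helper names write_site and write_pseudo_conf, the port 9000, and the choice of one property per file are all assumptions, and real guides usually set more (for example the NameNode and DataNode directories).

```shell
#!/bin/sh
# Sketch: write a minimal pseudo-distributed config into a directory.
# write_site and write_pseudo_conf are hypothetical helpers, not Hadoop tools.

write_site() {
    # $1 = target file, $2 = property name, $3 = property value
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

write_pseudo_conf() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1
    # Port 9000 is an assumption; any free port works.
    write_site "$conf_dir/core-site.xml"   fs.defaultFS hdfs://localhost:9000
    # A single node can only hold one replica of each block.
    write_site "$conf_dir/hdfs-site.xml"   dfs.replication 1
    # Run MapReduce jobs on YARN.
    write_site "$conf_dir/mapred-site.xml" mapreduce.framework.name yarn
    # Shuffle service that MapReduce needs on each NodeManager.
    write_site "$conf_dir/yarn-site.xml"   yarn.nodemanager.aux-services mapreduce_shuffle
    # Step 4: the only worker is this machine.
    echo localhost > "$conf_dir/slaves"
}

# Usage (not run here; try a scratch directory first, it overwrites files):
#   write_pseudo_conf /tmp/hadoop-conf-test
```

None of these settings has any effect on the native-library warning shown above.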
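To get rid of the native-library warning, I need to install Maven and then rebuild the native code from the hadoop 2.2.0 source.
A guru on CSDN already has a guide for this, but did they really not run into any ERROR? http://blog.csdn.net/focusheart/article/details/14058153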
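For orientation, the native rebuild boils down to a single Maven invocation run from the top of the source tree. The sketch below wraps the command given in Hadoop's BUILDING.txt; build_native is a hypothetical helper name and the default source path is an assumption. The native profile also expects a C toolchain, CMake and Protocol Buffers 2.5.0 to be installed.

```shell
#!/bin/sh
# Sketch: build the Hadoop distribution with native code.
# build_native is a hypothetical helper; $1 is the unpacked source tree.
build_native() {
    src_dir="${1:-hadoop-2.2.0-src}"
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$src_dir" || exit 1
        # -Pdist,native : build the binary distribution including native code
        # -DskipTests   : do not run tests (test sources are still compiled)
        # -Dtar         : also produce a .tar.gz of the distribution
        mvn package -Pdist,native -DskipTests -Dtar
    )
}

# Usage (not run here): build_native ~/hadoop-2.2.0-src
```

Note that -DskipTests only skips running the tests, so a failure in the testCompile phase can still stop the build.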
The build fails with:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-auth: Compilation failure: Compilation failure:
[ERROR] /home/chuan/trunk/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[84,13] cannot access org.mortbay.component.AbstractLifeCycle
[ERROR] class file for org.mortbay.component.AbstractLifeCycle not found
The fix: https://issues.apache.org/jira/browse/HADOOP-10110
Index: hadoop-common-project/hadoop-auth/pom.xml
===================================================================
--- hadoop-common-project/hadoop-auth/pom.xml	(revision 1543124)
+++ hadoop-common-project/hadoop-auth/pom.xml	(working copy)
@@ -54,6 +54,11 @@
     </dependency>
     <dependency>
       <groupId>org.mortbay.jetty</groupId>
+      <artifactId>jetty-util</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.mortbay.jetty</groupId>
       <artifactId>jetty</artifactId>
       <scope>test</scope>
     </dependency>
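Before rebuilding, it may be worth confirming that the jetty-util dependency really made it into the pom. A minimal sketch, assuming the root of the source tree is passed in; has_jetty_util is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: succeed only if hadoop-auth/pom.xml declares the jetty-util artifact.
# has_jetty_util is a hypothetical helper; $1 is the source tree root.
has_jetty_util() {
    pom="$1/hadoop-common-project/hadoop-auth/pom.xml"
    grep -q '<artifactId>jetty-util</artifactId>' "$pom"
}

# Usage (not run here):
#   has_jetty_util ~/hadoop-2.2.0-src && echo "jetty-util dependency present"
```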
Once the build finishes, copy the native files under hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib into /opt/hadoop-2.2.0/lib
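The copy step might look like the sketch below. It assumes the native libraries sit in a lib/native subdirectory on both sides, which matches the library path in the JVM warning; install_native is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: copy freshly built native libraries over the installed ones.
# install_native is a hypothetical helper; both roots are caller-supplied.
install_native() {
    built="$1/lib/native"    # e.g. hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
    target="$2/lib/native"   # e.g. /opt/hadoop-2.2.0
    [ -d "$built" ] || { echo "no native dir: $built" >&2; return 1; }
    mkdir -p "$target" || return 1
    # -R copies the directory contents; -P keeps any symlinks as links.
    cp -RP "$built"/. "$target"/
}

# Usage (not run here):
#   install_native hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0 /opt/hadoop-2.2.0
```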
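It's too late tonight, so I won't test it now. To be continued tomorrow~
Below is a simple roadmap of Hadoop as I personally understand it~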