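A multi-table join is similar to a single-table join: it also processes the raw data to extract the information of interest.
The input consists of two files. One is the factory table, with a factory name column and an address ID column; the other is the address table, with an address name column and an address ID column. The task is to find the factory name to address name correspondence in the input data and output a factory name / address name table.
Sample data: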
factory:
factoryname                 addressed
Beijing Red Star            1
Shenzhen Thunder            3
Guangzhou Honda             2
Beijing Rising              1
Guangzhou Development Bank  2
Tencent                     3
Bank of Beijing             1
address:
addressID  addressname
1          Beijing
2          Guangzhou
3          Shenzhen
4          Xian
Result:
factoryname                 addressname
Beijing Red Star            Beijing
Beijing Rising              Beijing
Bank of Beijing             Beijing
Guangzhou Honda             Guangzhou
Guangzhou Development Bank  Guangzhou
Shenzhen Thunder            Shenzhen
Tencent                     Shenzhen
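The expected result is an inner join on the address ID (Xian has no factory, so it does not appear). As a minimal sketch of that relationship, independent of Hadoop, the same join can be done in memory. The class `InMemoryJoin` and its methods are hypothetical names introduced here for illustration; the sketch assumes header lines have already been removed, that the address ID is the last token of a factory line, and that it is the first token of an address line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original job: joins the two sample
// tables in memory to show what the MapReduce job is expected to produce.
class InMemoryJoin {

    // Splits "Beijing Red Star 1" into {"Beijing Red Star", "1"}.
    static String[] splitFactoryLine(String line) {
        int lastSpace = line.lastIndexOf(' ');
        return new String[] {
            line.substring(0, lastSpace).trim(),
            line.substring(lastSpace + 1)
        };
    }

    // Returns "factoryname<TAB>addressname" for every factory whose ID is known.
    static List<String> join(List<String> factoryLines, List<String> addressLines) {
        Map<String, String> addressById = new HashMap<String, String>();
        for (String line : addressLines) {
            int firstSpace = line.indexOf(' ');
            addressById.put(line.substring(0, firstSpace),
                            line.substring(firstSpace + 1).trim());
        }
        List<String> result = new ArrayList<String>();
        for (String line : factoryLines) {
            String[] parts = splitFactoryLine(line);
            String addressName = addressById.get(parts[1]);
            if (addressName != null) { // inner join: skip unknown address IDs
                result.add(parts[0] + "\t" + addressName);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> factories = Arrays.asList(
            "Beijing Red Star 1", "Shenzhen Thunder 3", "Guangzhou Honda 2");
        List<String> addresses = Arrays.asList(
            "1 Beijing", "2 Guangzhou", "3 Shenzhen", "4 Xian");
        for (String row : join(factories, addresses)) {
            System.out.println(row);
        }
    }
}
```

This only works when the address table fits in memory; the MapReduce version avoids that restriction by grouping both tables on the address ID during the shuffle.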
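The code is as follows:
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

// The class name must match the file name MTJoin.java used in the commands below.
public class MTJoin {

    // Makes sure the header row is written only once (assumes a single reducer).
    public static int time = 0;

    /*
     * The mapper first decides whether an input line belongs to the left table
     * (factory) or the right table (address), then splits it into two columns.
     * The join column (address ID) is emitted as the key; the remaining column
     * together with a left/right tag is emitted as the value.
     */
    public static class Map extends Mapper<Object, Text, Text, Text> {

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString(); // one line of an input file
            String relationtype = "";       // left/right table tag

            // Skip the header line of either input file
            if (line.contains("factoryname") || line.contains("addressed")
                    || line.contains("addressID")) {
                return;
            }

            StringTokenizer itr = new StringTokenizer(line);
            String mapkey = "";
            String mapvalue = "";
            int i = 0; // number of name tokens seen so far

            while (itr.hasMoreTokens()) {
                String token = itr.nextToken();
                // A token starting with a digit is the address ID (the join key)
                if (token.charAt(0) >= '0' && token.charAt(0) <= '9') {
                    mapkey = token;
                    // ID after the name: factory (left) table; ID first: address (right) table
                    if (i > 0) {
                        relationtype = "1";
                    } else {
                        relationtype = "2";
                    }
                    continue;
                }
                // Accumulate the factory name or address name
                mapvalue += token + " ";
                i++;
            }

            // Ignore blank or malformed lines that carry no address ID
            if (mapkey.isEmpty()) {
                return;
            }

            // Emit the tagged record, e.g. key "1", value "1+Beijing Red Star"
            context.write(new Text(mapkey),
                    new Text(relationtype + "+" + mapvalue.trim()));
        }
    }

    /*
     * The reducer parses the map output, stores the values of the left and
     * right tables separately, then computes and writes their Cartesian product.
     */
    public static class Reduce extends Reducer<Text, Text, Text, Text> {

        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Write the header row
            if (0 == time) {
                context.write(new Text("factoryname"), new Text("addressname"));
                time++;
            }

            // Lists instead of fixed String[10] arrays, so a key with more
            // than ten records does not overflow
            List<String> factory = new ArrayList<String>();
            List<String> address = new ArrayList<String>();

            for (Text val : values) {
                String record = val.toString();
                if (record.length() < 2) {
                    continue;
                }
                // The first character is the table tag; the payload starts at index 2
                char relationtype = record.charAt(0);
                if ('1' == relationtype) {        // left table
                    factory.add(record.substring(2));
                } else if ('2' == relationtype) { // right table
                    address.add(record.substring(2));
                }
            }

            // Cartesian product of the two sides for this address ID
            for (String f : factory) {
                for (String a : address) {
                    context.write(new Text(f), new Text(a));
                }
            }
        }
    }

    // The source listing is cut off before the driver. This main method is a
    // reconstruction inferred from the imports and from the run command
    // "hadoop jar MTJoin.jar MTJoin input output".
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: MTJoin <in> <out>");
            System.exit(2);
        }

        Job job = new Job(conf, "Multiple Table Join");
        job.setJarByClass(MTJoin.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Compile and package
javac -classpath hadoop-core-1.1.2.jar:/opt/hadoop-1.1.2/lib/commons-cli-1.2.jar -d firstProject firstProject/MTJoin.java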
jar -cvf MTJoin.jar -C firstProject/ .
Delete any existing output directory and upload the two input files
hadoop fs -rmr output
hadoop fs -mkdir input
hadoop fs -put factory input
hadoop fs -put address input
Run
hadoop jar MTJoin.jar MTJoin input output
View the result
hadoop fs -cat output/part-r-00000
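To sanity-check what part-r-00000 should contain without a cluster, the map-side tagging and the reduce-side Cartesian product can be simulated in plain Java. This is only a sketch under stated assumptions: `LocalJoinDryRun`, `tag`, and `run` are hypothetical names that do not appear in the job, a `TreeMap` stands in for Hadoop's shuffle and sort, and a line whose first and last tokens both lack a leading digit is treated as a header.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local dry run of the reduce-side join; no Hadoop required.
class LocalJoinDryRun {

    // Map step: returns {joinKey, taggedValue}, or null for header/blank lines.
    // Tag "1" marks a factory record, tag "2" an address record.
    static String[] tag(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 2) {
            return null;
        }
        int last = tokens.length - 1;
        if (Character.isDigit(tokens[0].charAt(0))) {
            // ID first: address table
            String name = String.join(" ", Arrays.copyOfRange(tokens, 1, tokens.length));
            return new String[] { tokens[0], "2+" + name };
        }
        if (Character.isDigit(tokens[last].charAt(0))) {
            // ID last: factory table
            String name = String.join(" ", Arrays.copyOfRange(tokens, 0, last));
            return new String[] { tokens[last], "1+" + name };
        }
        return null; // no ID found, e.g. a header line
    }

    // Shuffle + reduce step: group by key, then cross factories with addresses.
    static List<String> run(List<String> lines) {
        Map<String, List<String>> grouped = new TreeMap<String, List<String>>();
        for (String line : lines) {
            String[] kv = tag(line);
            if (kv == null) {
                continue;
            }
            if (!grouped.containsKey(kv[0])) {
                grouped.put(kv[0], new ArrayList<String>());
            }
            grouped.get(kv[0]).add(kv[1]);
        }
        List<String> out = new ArrayList<String>();
        for (List<String> values : grouped.values()) {
            List<String> factories = new ArrayList<String>();
            List<String> addresses = new ArrayList<String>();
            for (String v : values) {
                if (v.charAt(0) == '1') {
                    factories.add(v.substring(2));
                } else {
                    addresses.add(v.substring(2));
                }
            }
            for (String f : factories) {
                for (String a : addresses) {
                    out.add(f + "\t" + a);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "factoryname addressed", "Beijing Red Star 1", "Tencent 3",
            "addressID addressname", "1 Beijing", "3 Shenzhen", "4 Xian");
        for (String row : run(lines)) {
            System.out.println(row);
        }
    }
}
```

On a real cluster the order of values within a key is not guaranteed, so rows for the same address may appear in a different order than in this local run.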
Author: a331251021, published 2013-8-4 16:20:52
Original post: "hadoop实例---多表关联" (Hadoop example: multi-table join). Thanks to the original author for sharing.
