详解User Defined Java Class步骤(二) kettle中的user defined java class步骤,也称UDJC步骤,从4.0版本就有,功能非常强大,无所不能;可以在其中写任意代码,却不影响效率。本文将详细介绍在不同场景中用示例展示如果使用该步骤,由于内容非常多,便于
详解User Defined Java Class步骤(二)
kettle中的“user defined java class”步骤,也称UDJC步骤,从4.0版本就有,功能非常强大,无所不能;可以在其中写任意代码,却不影响效率。本文将详细介绍在不同场景中用示例展示如果使用该步骤,由于内容非常多,便于阅读方便,把内容分成三部分,请完整看完全部内容,示例代码在这里下载.
如果没有从第一部分开始,请访问第一部分。
使用步骤参数(Step Parameter)
如果你写了一段代码,如果想让带更通用,步骤参数这时就能用到;在示例中,我们提供一个正则表达式和一个字段的名称,该步骤检查参数对应的字段是否匹配正则表达式,如果是返回结果为1,反之为0。
代码如下:
import java.util.regex.Pattern;
private Pattern p = null;
private FieldHelper fieldToTest = null;
private FieldHelper outputField = null;
public boolean processRow(StepMetaInterfacesmi, StepDataInterface sdi) throws KettleException
{
Object[] r = getRow();
if (r == null) {
setOutputDone();
return false;
}
// prepare regex and field helpers
if (first){
first = false;
String regexString = getParameter("regex");
p = Pattern.compile(regexString);
fieldToTest = get(Fields.In, getParameter("test_field"));
outputField = get(Fields.Out, "result");
}
r= createOutputRow(r, data.outputRowMeta.size());
// Get the value from an input field
String test_value = fieldToTest.getString(r);
// test for match and write result
if (p.matcher(test_value).matches()){
outputField.setValue(r, Long.valueOf(1));
}
else{
outputField.setValue(r, Long.valueOf(0));
}
// Send the row on to the next step.
putRow(data.outputRowMeta, r);
return true;
}
getParameter()方法返回在ui界面中定义的参数对应值内容,当然参数的值也可能是kettle的变量。把变量作为参数是使用变量通常的做法。我们可以在步骤的xml代码中手工搜索到变量。
示例的转换名称是:parameter.ktr.
消息步骤(Info Steps)使用
有时需要合并多个输入步骤,可能赋予不同的角色,就如流查询步骤。消息步骤用来提供查询,其数据行不通过getRow()方法返回。在udjc步骤中非常容易使用。在udjc步骤的ui界面消息步骤选项卡中定义,通过getRowsFrom()方法返回对应的值。
示例转换中使用消息步骤接收一组正则表达式,用其测试主流数据中的一个字段是否匹配,如果任何一个表达式匹配,结果字段设置为1.如果没有任何匹配,则结果为0,同时附加输出匹配的表达式。
代码如下:
import java.util.regex.Pattern;
import java.util.*;
private FieldHelper resultField = null;
private FieldHelper matchField = null;
private FieldHelper outputField = null;
private FieldHelper inputField = null;
private ArrayList patterns = newArrayList(20);
private ArrayList expressions = newArrayList(20);
public boolean processRow(StepMetaInterfacesmi, StepDataInterface sdi) throws KettleException
{
Object[] r = getRow();
if (r == null) {
setOutputDone();
return false;
}
// prepare regex and field helpers
if (first){
first = false;
// get the input and output fields
resultField = get(Fields.Out, "result");
matchField = get(Fields.Out, "matched_by");
inputField = get(Fields.In, "value");
// get all rows from the info stream andcompile the regex field to patterns
FieldHelper regexField = get(Fields.Info, "regex");
RowSet infoStream = findInfoRowSet("expressions");
Object[] infoRow = null;
while((infoRow = getRowFrom(infoStream)) != null){
String regexString = regexField.getString(infoRow);
expressions.add(regexString);
patterns.add(Pattern.compile(regexString));
}
}
// get the value of the field to check
String value = inputField.getString(r);
// check if any pattern matches
int matchFound = 0;
String matchExpression = null;
for(int i=0;i if (((Pattern) patterns.get(i)).matcher(value).matches()){ matchFound = 1; matchExpression = (String)expressions.get(i); break; } } // write result to stream r= createOutputRow(r, data.outputRowMeta.size()); resultField.setValue(r, Long.valueOf(matchFound)); matchField.setValue(r, matchExpression); // Send the row on to the next step. putRow(data.outputRowMeta, r); return true; } 调用findInfoRowSet()方法,返回在udjc步骤的消息步骤中定义的名称对应的输入步骤的整个行集内容。从行集内容中读取某行与从主数据流中去某行不同,通过调用getRowFrom(),并显示指明那个行集。 示例转换的名称为info_steps.ktr. 使用目标步骤(Target Steps) 使用udjc步骤有时可能需要指定行集流转到不同的目标步骤。通过调用putRow()方法,并传递一个目标步骤作为参数。我们需要在udjc步骤的ui界面的目标步骤中定义所有可能的目标步骤,下面示例中随机分发行数据到不同弄的目标步骤。 findTargetRowSet()方法返回在ui界面中定义的目标步骤行集,并作为putRowto()方法的参数.示例转换的名称为target_steps.ktr. 代码如下: import java.util.regex.Pattern; import java.util.*; private RowSet lowProbStream = null; private RowSet highProbStream = null; public boolean processRow(StepMetaInterfacesmi, StepDataInterface sdi) throws KettleException { Object[]r = getRow(); if(r == null) { setOutputDone(); returnfalse; } //prepare regex and field helpers if (first){ first = false; lowProbStream= findTargetRowSet("low_probability"); highProbStream= findTargetRowSet("high_probability"); } //Send the row on to the next step. if(Math.random()
putRowTo(data.outputRowMeta, r,lowProbStream); } else{ putRowTo(data.outputRowMeta, r,highProbStream); } returntrue; } 更多内容请查看第三部分;

MySQL uses a GPL license. 1) The GPL license allows the free use, modification and distribution of MySQL, but the modified distribution must comply with GPL. 2) Commercial licenses can avoid public modifications and are suitable for commercial applications that require confidentiality.

The situations when choosing InnoDB instead of MyISAM include: 1) transaction support, 2) high concurrency environment, 3) high data consistency; conversely, the situation when choosing MyISAM includes: 1) mainly read operations, 2) no transaction support is required. InnoDB is suitable for applications that require high data consistency and transaction processing, such as e-commerce platforms, while MyISAM is suitable for read-intensive and transaction-free applications such as blog systems.

In MySQL, the function of foreign keys is to establish the relationship between tables and ensure the consistency and integrity of the data. Foreign keys maintain the effectiveness of data through reference integrity checks and cascading operations. Pay attention to performance optimization and avoid common errors when using them.

There are four main index types in MySQL: B-Tree index, hash index, full-text index and spatial index. 1.B-Tree index is suitable for range query, sorting and grouping, and is suitable for creation on the name column of the employees table. 2. Hash index is suitable for equivalent queries and is suitable for creation on the id column of the hash_table table of the MEMORY storage engine. 3. Full text index is used for text search, suitable for creation on the content column of the articles table. 4. Spatial index is used for geospatial query, suitable for creation on geom columns of locations table.

TocreateanindexinMySQL,usetheCREATEINDEXstatement.1)Forasinglecolumn,use"CREATEINDEXidx_lastnameONemployees(lastname);"2)Foracompositeindex,use"CREATEINDEXidx_nameONemployees(lastname,firstname);"3)Forauniqueindex,use"CREATEU

The main difference between MySQL and SQLite is the design concept and usage scenarios: 1. MySQL is suitable for large applications and enterprise-level solutions, supporting high performance and high concurrency; 2. SQLite is suitable for mobile applications and desktop software, lightweight and easy to embed.

Indexes in MySQL are an ordered structure of one or more columns in a database table, used to speed up data retrieval. 1) Indexes improve query speed by reducing the amount of scanned data. 2) B-Tree index uses a balanced tree structure, which is suitable for range query and sorting. 3) Use CREATEINDEX statements to create indexes, such as CREATEINDEXidx_customer_idONorders(customer_id). 4) Composite indexes can optimize multi-column queries, such as CREATEINDEXidx_customer_orderONorders(customer_id,order_date). 5) Use EXPLAIN to analyze query plans and avoid

Using transactions in MySQL ensures data consistency. 1) Start the transaction through STARTTRANSACTION, and then execute SQL operations and submit it with COMMIT or ROLLBACK. 2) Use SAVEPOINT to set a save point to allow partial rollback. 3) Performance optimization suggestions include shortening transaction time, avoiding large-scale queries and using isolation levels reasonably.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

SecLists
SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

PhpStorm Mac version
The latest (2018.2.1) professional PHP integrated development tool

SublimeText3 Chinese version
Chinese version, very easy to use

MantisBT
Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.

Atom editor mac version download
The most popular open source editor
