A Detailed Guide to Kettle's User Defined Java Class Step (Part 2)


WBOY

Jun 07, 2016, 04:02 PM


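The "User Defined Java Class" step in Kettle, also known as the UDJC step, has been available since version 4.0. It is very powerful: you can write arbitrary code in it without sacrificing performance. This article uses examples to show how to use the step in different scenarios. Because there is a lot of material, it is split into three parts for easier reading; please read all of them. The example code can be downloaded here.

If you have not started with Part 1, please read Part 1 first.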

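Using Step Parameters

Step parameters are useful when you have written some code and want to make it more general. In this example we supply a regular expression and the name of a field. The step checks whether the field named by the parameter matches the regular expression and returns 1 if it does, 0 otherwise.

The code is as follows: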

    import java.util.regex.Pattern;

    private Pattern p = null;
    private FieldHelper fieldToTest = null;
    private FieldHelper outputField = null;

    public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
    {
        Object[] r = getRow();
        if (r == null) {
            setOutputDone();
            return false;
        }

        // On the first row, compile the regex and prepare the field helpers
        if (first) {
            first = false;
            String regexString = getParameter("regex");
            p = Pattern.compile(regexString);
            fieldToTest = get(Fields.In, getParameter("test_field"));
            outputField = get(Fields.Out, "result");
        }

        // Resize the row so it has room for the output field
        r = createOutputRow(r, data.outputRowMeta.size());

        // Get the value from the input field
        String test_value = fieldToTest.getString(r);

        // Test for a match and write the result (1 = match, 0 = no match)
        if (p.matcher(test_value).matches()) {
            outputField.setValue(r, Long.valueOf(1));
        }
        else {
            outputField.setValue(r, Long.valueOf(0));
        }

        // Send the row on to the next step
        putRow(data.outputRowMeta, r);
        return true;
    }

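The getParameter() method returns the value of a parameter defined in the step's UI. The parameter value may itself contain Kettle variables, and passing variables in through parameters is the usual way to use variables in this step. Variables can otherwise only be found by searching the step's XML by hand.

The example transformation is named parameter.ktr.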
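Outside of Kettle, the core of this step can be sketched in plain Java. This is only an illustration under stated assumptions, not the UDJC code itself: the class `RegexFlagger` and its method `flag` are hypothetical names, and the constructor argument stands in for the value that getParameter("regex") would supply. The sketch also treats a null value as a non-match, a case the step code above does not handle.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone helper mirroring the step's logic (not part of the Kettle API).
class RegexFlagger {
    private final Pattern pattern;

    // The regex is compiled once, like the "if (first)" block in the step.
    RegexFlagger(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Returns 1 when the whole value matches the regex, otherwise 0.
    long flag(String value) {
        if (value == null) {
            return 0L; // assumption: a missing value counts as "no match"
        }
        return pattern.matcher(value).matches() ? 1L : 0L;
    }
}
```

Note that matches() requires the entire value to match the expression; a substring search would use find() instead.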

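Using Info Steps

Sometimes several input steps need to be combined, possibly with different roles, as in the Stream Lookup step. An info step supplies lookup data, and its rows are not returned by getRow(). Info steps are easy to use in the UDJC step: define them on the Info steps tab of the step's UI, then read their rows with getRowFrom().

The example transformation uses an info step to receive a set of regular expressions and tests whether a field in the main stream matches any of them. If any expression matches, the result field is set to 1; if none matches, it is set to 0. The matching expression is also written to an additional output field.

The code is as follows: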

    import java.util.regex.Pattern;
    import java.util.*;

    private FieldHelper resultField = null;
    private FieldHelper matchField = null;
    private FieldHelper outputField = null;
    private FieldHelper inputField = null;

    // Raw (non-generic) lists, as in the original example
    private ArrayList patterns = new ArrayList(20);
    private ArrayList expressions = new ArrayList(20);

    public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
    {
        Object[] r = getRow();
        if (r == null) {
            setOutputDone();
            return false;
        }

        // On the first row, prepare the field helpers and load the patterns
        if (first) {
            first = false;

            // Get the input and output fields
            resultField = get(Fields.Out, "result");
            matchField = get(Fields.Out, "matched_by");
            inputField = get(Fields.In, "value");

            // Get all rows from the info stream and compile the regex field to patterns
            FieldHelper regexField = get(Fields.Info, "regex");
            RowSet infoStream = findInfoRowSet("expressions");
            Object[] infoRow = null;
            while ((infoRow = getRowFrom(infoStream)) != null) {
                String regexString = regexField.getString(infoRow);
                expressions.add(regexString);
                patterns.add(Pattern.compile(regexString));
            }
        }

        // Get the value of the field to check
        String value = inputField.getString(r);

        // Check whether any pattern matches; stop at the first match
        int matchFound = 0;
        String matchExpression = null;
        for (int i = 0; i < patterns.size(); i++) {
            if (((Pattern) patterns.get(i)).matcher(value).matches()) {
                matchFound = 1;
                matchExpression = (String) expressions.get(i);
                break;
            }
        }

        // Write the result to the stream
        r = createOutputRow(r, data.outputRowMeta.size());
        resultField.setValue(r, Long.valueOf(matchFound));
        matchField.setValue(r, matchExpression);

        // Send the row on to the next step
        putRow(data.outputRowMeta, r);
        return true;
    }

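Calling findInfoRowSet() returns the row set of the input step registered under the given name on the UDJC step's Info steps tab. Reading a row from this row set differs from reading a row from the main stream: you call getRowFrom() and explicitly specify which row set to read from.

The example transformation is named info_steps.ktr.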
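The first-match lookup at the heart of this example can also be sketched as stand-alone Java. The class `FirstMatchFinder` and its methods are hypothetical names used only for illustration; in the step itself the expressions come from the info stream rather than from calls to `add`. Because this sketch runs outside the step, it uses generic collections instead of the raw lists in the step code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: keeps compiled patterns next to their source expressions.
class FirstMatchFinder {
    private final List<String> expressions = new ArrayList<>();
    private final List<Pattern> patterns = new ArrayList<>();

    // Stands in for reading one row from the info stream.
    void add(String regex) {
        expressions.add(regex);
        patterns.add(Pattern.compile(regex));
    }

    // Returns the first expression that matches the whole value, or null if none does.
    String firstMatch(String value) {
        if (value == null) {
            return null; // assumption: a missing value matches nothing
        }
        for (int i = 0; i < patterns.size(); i++) {
            if (patterns.get(i).matcher(value).matches()) {
                return expressions.get(i);
            }
        }
        return null;
    }
}
```

A non-null return value corresponds to result = 1 with matched_by set to the returned expression; null corresponds to result = 0.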

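Using Target Steps

With the UDJC step you sometimes need to route rows to different target steps. This is done by calling putRowTo() and passing the target row set as an argument. All possible target steps must be defined on the Target steps tab of the UDJC step's UI. The example below distributes rows randomly among different target steps.

The findTargetRowSet() method returns the row set of a target step defined in the UI, which is then passed to putRowTo(). The example transformation is named target_steps.ktr.

The code is as follows: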

    private RowSet lowProbStream = null;
    private RowSet highProbStream = null;

    // ASSUMED value: the threshold in the original comparison was lost
    // when the page was extracted; adjust it to the split you need.
    private double lowProbability = 0.2;

    public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
    {
        Object[] r = getRow();
        if (r == null) {
            setOutputDone();
            return false;
        }

        // On the first row, look up the target row sets
        if (first) {
            first = false;
            lowProbStream = findTargetRowSet("low_probability");
            highProbStream = findTargetRowSet("high_probability");
        }

        // Send the row on to one of the two target steps
        if (Math.random() < lowProbability) {
            putRowTo(data.outputRowMeta, r, lowProbStream);
        }
        else {
            putRowTo(data.outputRowMeta, r, highProbStream);
        }

        return true;
    }
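The routing decision can be tried outside Kettle with a small stand-alone sketch. Everything here is an assumption made for illustration: `RandomRouter` is a hypothetical class, the two lists stand in for the target row sets, and the threshold and seed are arbitrary values chosen by the caller rather than anything taken from the example transformation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the step: two lists play the role of the target row sets.
class RandomRouter {
    final List<Object[]> lowProb = new ArrayList<>();
    final List<Object[]> highProb = new ArrayList<>();
    private final double threshold;
    private final Random random;

    // threshold: chance (0.0 to 1.0) that a row goes to lowProb; seed makes runs repeatable.
    RandomRouter(double threshold, long seed) {
        this.threshold = threshold;
        this.random = new Random(seed);
    }

    // Sends the row to exactly one of the two targets.
    void route(Object[] row) {
        if (random.nextDouble() < threshold) {
            lowProb.add(row);
        } else {
            highProb.add(row);
        }
    }
}
```

Each row ends up in exactly one list, so the sizes of the two lists always add up to the number of rows routed.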

For more, see Part 3.
