Kettle's UserDefinedJavaClass Step in Detail (Part 3)

WBOY

Jun 07, 2016, 04:02 PM


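Kettle's "User Defined Java Class" step, also known as the UDJC step, has been available since version 4.0. It is very powerful: you can write arbitrary Java code in it without sacrificing performance. This series uses examples to show how to apply the step in different scenarios. Because there is a lot of material, it is split into three parts for easier reading; please read all of them. The sample code is available for download.

If you have not read part two, please read it first.

Error handling

The UDJC step supports Kettle's error-handling feature. Drag a hop from the UDJC step to a Dummy step, which will receive the error rows, then right-click the UDJC step and choose "Define error handling...". In the dialog you select the target step for error rows; the remaining options and field names configure the extended error information. Inside the UDJC step, call putError() to forward a row to the error-handling step.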

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    Object[] r = getRow();
    if (r == null) {
        setOutputDone();
        return false;
    }

    if (first) {
        first = false;
    }

    r = createOutputRow(r, data.outputRowMeta.size());

    // Get the values from the input fields
    Long numerator = get(Fields.In, "numerator").getInteger(r);
    Long denominator = get(Fields.In, "denominator").getInteger(r);

    // Avoid dividing by 0
    if (denominator == 0) {
        // putError is declared as follows:
        // public void putError(RowMetaInterface rowMeta, Object[] row, long nrErrors, String errorDescriptions, String fieldNames, String errorCodes)
        putError(data.outputRowMeta, r, 1, "Denominator must be different from 0", "denominator", "DIV_0");
        // Carry on with the next row
        return true;
    }

    long integer_division = numerator / denominator;
    long remainder = numerator % denominator;

    // Write the output fields
    get(Fields.Out, "integer_division").setValue(r, Long.valueOf(integer_division));
    get(Fields.Out, "remainder").setValue(r, Long.valueOf(remainder));

    // Send the row on to the next step
    putRow(data.outputRowMeta, r);
    return true;
}
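The routing decision itself can be tried outside Kettle. The following is a minimal standalone sketch, not Kettle code: the class `ErrorRoutingSketch` and its two lists are hypothetical stand-ins for the step, `putRow()` and `putError()`. It only illustrates that a bad row is diverted and processing continues with the next row.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch: mimics the putRow()/putError() split with two plain lists.
// None of these names are Kettle API; they are hypothetical stand-ins.
class ErrorRoutingSketch {
    final List<long[]> mainOutput = new ArrayList<>();   // stands in for putRow()
    final List<String> errorOutput = new ArrayList<>();  // stands in for putError()

    // Process one row; a zero denominator is diverted instead of throwing.
    void processRow(long numerator, long denominator) {
        if (denominator == 0) {
            errorOutput.add("DIV_0: Denominator must be different from 0 (numerator=" + numerator + ")");
            return; // carry on with the next row
        }
        mainOutput.add(new long[] { numerator / denominator, numerator % denominator });
    }

    public static void main(String[] args) {
        ErrorRoutingSketch step = new ErrorRoutingSketch();
        step.processRow(7, 2);
        step.processRow(5, 0);
        step.processRow(-7, 2);
        for (long[] row : step.mainOutput) {
            System.out.println(row[0] + " remainder " + row[1]);
        }
        System.out.println(step.errorOutput.size() + " error row(s)");
    }
}
```

Running `main` prints `3 remainder 1`, `-3 remainder -1` and `1 error row(s)`. Java integer division truncates toward zero, so the remainder takes the sign of the numerator.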

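Accessing database connections

If the UDJC step needs database functionality, it can obtain a database connection through Kettle. The example below uses a connection named "TestDB" that is defined in Kettle. Each input row has a "table_name" field; the step checks whether that table exists and writes the result to the output row.

For serious database work in a UDJC step, it helps to be familiar with the org.pentaho.di.core.database package in the source code. The database-related steps and the sample code also show how the classes in that package are used.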

import org.pentaho.di.core.database.Database;
import java.util.List;
import java.util.Arrays;

private Database db = null;
private FieldHelper outputField = null;
private FieldHelper tableField = null;
private List existingTables = null;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    Object[] r = getRow();
    if (r == null) {
        setOutputDone();
        return false;
    }

    if (first) {
        first = false;
        // Read the table list once and cache the field helpers
        existingTables = Arrays.asList(db.getTablenames());
        tableField = get(Fields.In, "table_name");
        outputField = get(Fields.Out, "table_exists");
    }

    r = createOutputRow(r, data.outputRowMeta.size());

    // Write 1 if the table exists, 0 otherwise
    if (existingTables.contains(tableField.getString(r))) {
        outputField.setValue(r, Long.valueOf(1));
    }
    else {
        outputField.setValue(r, Long.valueOf(0));
    }

    // Send the row on to the next step
    putRow(data.outputRowMeta, r);
    return true;
}

public boolean init(StepMetaInterface stepMetaInterface, StepDataInterface stepDataInterface)
{
    if (parent.initImpl(stepMetaInterface, stepDataInterface)) {
        try {
            db = new Database(this.parent, getTransMeta().findDatabase("TestDB"));
            db.shareVariablesWith(this.parent);
            db.connect();
            return true;
        }
        catch (KettleDatabaseException e) {
            logError("Error connecting to TestDB: " + e.getMessage());
            setErrors(1);
            stopAll();
        }
    }
    return false;
}

public void dispose(StepMetaInterface smi, StepDataInterface sdi)
{
    // Release the connection before the regular cleanup
    if (db != null) {
        db.disconnect();
    }
    parent.disposeImpl(smi, sdi);
}

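The example UDJC step overrides init() and dispose() to open the database connection and to close it when the work is done. init() is called while the transformation is being initialized, before the first call to processRow(); dispose() is called after the transformation has finished. Put one-time setup in init() and resource cleanup in dispose(). The sample transformation is named db_access.ktr.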
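The calling order described above can be illustrated without Kettle. This is a minimal sketch under stated assumptions: `LifecycleSketch`, its `run()` driver and the `events` list are hypothetical names and not Kettle API, and the driver only approximates what the engine does around a step.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch of the init -> processRow (repeated) -> dispose order.
// The class and method names are hypothetical, not Kettle API.
class LifecycleSketch {
    final List<String> events = new ArrayList<>();
    private boolean resourceOpen = false; // stands in for a database connection
    private int remaining;                // number of rows still to process

    LifecycleSketch(int rows) {
        this.remaining = rows;
    }

    // One-time setup: acquire the resource before any row is processed.
    boolean init() {
        resourceOpen = true;
        events.add("init");
        return true;
    }

    // Returns false once there are no more rows.
    boolean processRow() {
        if (remaining == 0) {
            return false;
        }
        if (!resourceOpen) {
            throw new IllegalStateException("resource not open");
        }
        remaining--;
        events.add("row");
        return true;
    }

    // Cleanup: release the resource if it was acquired.
    void dispose() {
        if (resourceOpen) {
            resourceOpen = false;
            events.add("dispose");
        }
    }

    // Hypothetical driver that calls the three methods in lifecycle order.
    static List<String> run(int rows) {
        LifecycleSketch step = new LifecycleSketch(rows);
        if (step.init()) {
            try {
                while (step.processRow()) {
                    // keep processing until processRow() reports completion
                }
            } finally {
                step.dispose();
            }
        }
        return step.events;
    }

    public static void main(String[] args) {
        System.out.println(run(2)); // prints [init, row, row, dispose]
    }
}
```

The `finally` block is a choice made for this sketch so that cleanup runs even if a row fails; it is not a statement about how Kettle schedules dispose().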

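Implementing an input step

Sometimes the UDJC step is itself an input step: it generates its own rows and needs no upstream step to supply them. The example below emits the list of Java system properties as rows.

The code is as follows: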

import java.util.*;

private ArrayList keys = null;
private int idx = 0;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    if (first) {
        first = false;
        // Get the system property names; they are output one at a time later
        keys = Collections.list(System.getProperties().propertyNames());
        idx = 0;
    }

    if (idx >= keys.size()) {
        setOutputDone();
        return false;
    }

    // Create a row
    Object[] r = RowDataUtil.allocateRowData(data.outputRowMeta.size());

    // Set key and value in the new output row
    get(Fields.Out, "key").setValue(r, keys.get(idx));
    get(Fields.Out, "value").setValue(r, System.getProperties().get(keys.get(idx)));
    idx++;

    // Send the row on to the next step
    putRow(data.outputRowMeta, r);
    return true;
}

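The code never calls getRow() to fetch an input row. Instead, the first call to processRow() initializes the list of Java system property names, and each later call writes one of them to the output stream. Because there is no input row to extend, each output row is created with RowDataUtil.allocateRowData(), populated with the field values, and passed to the next step. The sample transformation is named input_step.ktr.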
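The "one row per call" pattern can also be exercised in plain Java. The sketch below assumes a hypothetical class `PropertyRowSource` whose `nextRow()` method plays the role of processRow(); returning `null` stands in for setOutputDone(). It is an illustration of the pattern, not Kettle code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Standalone sketch of a row generator: lazy initialization on the first call,
// then one {key, value} row per call. Names are hypothetical, not Kettle API.
class PropertyRowSource {
    private List<Object> keys = null;
    private int idx = 0;
    private boolean first = true;

    // Returns the next {key, value} row, or null when all properties were emitted.
    Object[] nextRow() {
        if (first) {
            first = false;
            // Snapshot the property names once; rows are produced later, one per call
            keys = new ArrayList<Object>(Collections.list(System.getProperties().propertyNames()));
        }
        if (idx >= keys.size()) {
            return null; // nothing left to emit
        }
        Object key = keys.get(idx++);
        return new Object[] { key, System.getProperty((String) key) };
    }

    public static void main(String[] args) {
        PropertyRowSource source = new PropertyRowSource();
        Object[] row;
        while ((row = source.nextRow()) != null) {
            System.out.println(row[0] + " = " + row[1]);
        }
    }
}
```

Taking the snapshot on the first call mirrors the `first` flag in the step code: the list is built once, and the index alone tracks progress between calls.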

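Summary

This article has shown how to use the UDJC step in several scenarios. If you need custom processing and the JavaScript step is not flexible enough or not fast enough, consider using the UDJC step instead. To learn more, look at the UDJC examples in the samples directory.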
