spark-jobserver提供了一个用于提交和管理Apache Spark作业(job)、jar文件和作业上下文(SparkContext)的RESTful接口。该项目位于git(https://github.com/ooyala/spark-jobserver),当前为0.4版本。 特性 Spark as a Service: 简单的面向job和context管理
spark-jobserver提供了一个用于提交和管理Apache Spark作业(job)、jar文件和作业上下文(SparkContext)的RESTful接口。该项目位于git(https://github.com/ooyala/spark-jobserver),当前为0.4版本。
特性
“Spark as a Service”: 简单的面向job和context管理的REST接口
通过长期运行的job context支持亚秒级低延时作业(job)
可以通过结束context来停止运行的作业(job)
分割jar上传步骤以提高job的启动
异步和同步的job API,其中同步API对低延时作业非常有效
支持Standalone Spark和Mesos
Job和jar信息通过一个可插拔的DAO接口来持久化
命名RDD以缓存,并可以通过该名称获取RDD。这样可以提高作业间RDD的共享和重用
安装并启动jobServer
jobServer依赖sbt,所以必须先装好sbt。
rpm -ivh https://dl.bintray.com/sbt/rpm/sbt-0.13.6.rpm yum install git # 下面clone这个项目 SHELL$ git clone https://github.com/ooyala/spark-jobserver.git # 在项目根目录下,进入sbt SHELL$ sbt ...... [info] Set current project to spark-jobserver-master (in build file:/D:/Projects /spark-jobserver-master/) > #在本地启动jobServer(开发者模式) >re-start --- -Xmx4g ...... #此时会下载spark-core,jetty和liftweb等相关模块。 job-server Starting spark.jobserver.JobServer.main() [success] Total time: 545 s, completed 2014-10-21 19:19:48
然后访问http://localhost:8090 可以看到Web UI
?
测试job执行
这里我们直接使用job-server的test包进行测试
SHELL$ sbt job-server-tests/package ...... [info] Compiling 5 Scala sources to /root/spark-jobserver/job-server-tests/target/classes... [info] Packaging /root/spark-jobserver/job-server-tests/target/job-server-tests-0.4.0.jar ... [info] Done packaging.
编译完成后,将打包的jar文件通过REST接口上传
REST接口的API如下:
GET /jobs
查询所有job
POST /jobs
提交一个新job
GET /jobs/<jobid></jobid>
查询某一任务的结果和状态
GET /jobs/<jobid>/config</jobid>
SHELL$ curl --data-binary @job-server-tests/target/job-server-tests-0.4.0.jar localhost:8090/jars/test OK # 查看提交的jar SHELL$ curl localhost:8090/jars/ { "test": "2014-10-22T15:15:04.826+08:00" } # 提交job 提交的appName为test,class为spark.jobserver.WordCountExample SHELL$ curl -d "input.string = hello job server" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample' { "status": "STARTED", "result": { "jobId": "34ce0666-0148-46f7-8bcf-a7a19b5608b2", "context": "eba36388-spark.jobserver.WordCountExample" } } # 通过job-id查看结果和配置信息 SHELL$ curl localhost:8090/jobs/34ce0666-0148-46f7-8bcf-a7a19b5608b2 { "status": "OK", "result": { "job": 1, "hello": 1, "server": 1 } SHELL$ curl localhost:8090/jobs/34ce0666-0148-46f7-8bcf-a7a19b5608b2/config { "input" : { "string" : "hello job server" } # 提交一个同步的job,当执行命令后,terminal会hang住直到任务执行完毕。 SHELL$ curl -d "input.string = hello job server" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'&sync=true { "status": "OK", "result": { "job": 1, "hello": 1, "server": 1 }
在Web UI上也可以看到Completed Jobs相应的信息。
预先启动Context
和Context相关的API
GET /contexts
?查询所有预先建立好的context
POST /contexts
?建立新的context
DELETE ?/contexts/<name></name>
?删除此context,停止运行于此context上的所有job
SHELL$ curl -d "" 'localhost:8090/contexts/test-context?num-cpu-cores=4&mem-per-node=512m' OK # 查看现有的context curl localhost:8090/contexts ["test-context", "feceedc3-spark.jobserver.WordCountExample"] 接下来在这个context上执行job curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample&context=test-context&sync=true' { "status": "OK", "result": { "a": 2, "b": 2, "c": 1, "see": 1 }
配置文件
打开配置文件,可以发现master设置为local[4],可以将其改为我们的集群地址。
vim spark-jobserver/config/local.conf.template master = "local[4]"
此外,关于数据对象的存储方法和路径:
jobdao = spark.jobserver.io.JobFileDAO filedao { rootdir = /tmp/spark-job-server/filedao/data }
默认context设置,该设置可以被
下面再次在sbt中启动REST接口的中的参数覆盖。
# universal context configuration. These settings can be overridden, see README.md context-settings { num-cpu-cores = 2 # Number of cores to allocate. Required. memory-per-node = 512m # Executor memory per node, -Xmx style eg 512m, #1G, etc. # in case spark distribution should be accessed from HDFS (as opposed to being installed on every mesos slave) # spark.executor.uri = "hdfs://namenode:8020/apps/spark/spark.tgz" # uris of jars to be loaded into the classpath for this context # dependent-jar-uris = ["file:///some/path/present/in/each/mesos/slave/somepackage.jar"] }
基本的使用到此为止,jobServer的部署和项目使用将之后介绍。顺便期待下一个版本SQL Window的功能。
^^
原文地址:Spark as a Service之JobServer初测, 感谢原作者分享。

MySQL's position in databases and programming is very important. It is an open source relational database management system that is widely used in various application scenarios. 1) MySQL provides efficient data storage, organization and retrieval functions, supporting Web, mobile and enterprise-level systems. 2) It uses a client-server architecture, supports multiple storage engines and index optimization. 3) Basic usages include creating tables and inserting data, and advanced usages involve multi-table JOINs and complex queries. 4) Frequently asked questions such as SQL syntax errors and performance issues can be debugged through the EXPLAIN command and slow query log. 5) Performance optimization methods include rational use of indexes, optimized query and use of caches. Best practices include using transactions and PreparedStatemen

MySQL is suitable for small and large enterprises. 1) Small businesses can use MySQL for basic data management, such as storing customer information. 2) Large enterprises can use MySQL to process massive data and complex business logic to optimize query performance and transaction processing.

InnoDB effectively prevents phantom reading through Next-KeyLocking mechanism. 1) Next-KeyLocking combines row lock and gap lock to lock records and their gaps to prevent new records from being inserted. 2) In practical applications, by optimizing query and adjusting isolation levels, lock competition can be reduced and concurrency performance can be improved.

MySQL is not a programming language, but its query language SQL has the characteristics of a programming language: 1. SQL supports conditional judgment, loops and variable operations; 2. Through stored procedures, triggers and functions, users can perform complex logical operations in the database.

MySQL is an open source relational database management system, mainly used to store and retrieve data quickly and reliably. Its working principle includes client requests, query resolution, execution of queries and return results. Examples of usage include creating tables, inserting and querying data, and advanced features such as JOIN operations. Common errors involve SQL syntax, data types, and permissions, and optimization suggestions include the use of indexes, optimized queries, and partitioning of tables.

MySQL is an open source relational database management system suitable for data storage, management, query and security. 1. It supports a variety of operating systems and is widely used in Web applications and other fields. 2. Through the client-server architecture and different storage engines, MySQL processes data efficiently. 3. Basic usage includes creating databases and tables, inserting, querying and updating data. 4. Advanced usage involves complex queries and stored procedures. 5. Common errors can be debugged through the EXPLAIN statement. 6. Performance optimization includes the rational use of indexes and optimized query statements.

MySQL is chosen for its performance, reliability, ease of use, and community support. 1.MySQL provides efficient data storage and retrieval functions, supporting multiple data types and advanced query operations. 2. Adopt client-server architecture and multiple storage engines to support transaction and query optimization. 3. Easy to use, supports a variety of operating systems and programming languages. 4. Have strong community support and provide rich resources and solutions.

InnoDB's lock mechanisms include shared locks, exclusive locks, intention locks, record locks, gap locks and next key locks. 1. Shared lock allows transactions to read data without preventing other transactions from reading. 2. Exclusive lock prevents other transactions from reading and modifying data. 3. Intention lock optimizes lock efficiency. 4. Record lock lock index record. 5. Gap lock locks index recording gap. 6. The next key lock is a combination of record lock and gap lock to ensure data consistency.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

MinGW - Minimalist GNU for Windows
This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

DVWA
Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

EditPlus Chinese cracked version
Small size, syntax highlighting, does not support code prompt function

SublimeText3 Linux new version
SublimeText3 Linux latest version

SublimeText3 Chinese version
Chinese version, very easy to use