
Function Development

Last updated: 2026-05-19 14:24:49

1. Log in to the WeData console.
2. Click Project List in the left menu and find the project in which you want to develop functions.
3. After selecting the project, click to enter the Data Development module.
4. Click Function Development in the left menu.

Function Overview

UDFs uploaded through the Resource Management feature can be registered in Function Development. After you define a function's category and class name, you can use it during data development. Currently, you can create Hive SQL, Spark SQL, and DLC SQL functions.

Creating a Function

1. On the function development page, click the create icon and choose to create a Hive SQL function, Spark SQL function, or DLC SQL function. You can also create a function of a specific type by clicking the button on the right side of the target path under that function type.



2. Configure the function in the pop-up window and click Save and submit to complete the function registration.



The configuration information is shown in the table below:
| Information | Description |
| --- | --- |
| Function category | Create the function in a preset category according to its nature. Categories include: analysis functions, encryption functions, aggregate functions, logical functions, date and time functions, mathematical functions, conversion functions, string functions, IP and domain functions, window functions, and other functions. |
| Class name | Enter the fully qualified class name of the function, for example com.example.UppercaseUDF. |
| Function file | Select the source of the function file. Select resource file: choose the function file from the jar or zip resources uploaded through the resource management feature. Specify COS path: obtain the function file from a path in the platform COS bucket. |
| Resource file | Applies when the function file option is Select resource file. Specify the required function file in the resource management directory. |
| COS path | Applies when the function file option is Specify COS path. Enter the path of the function file in the platform COS bucket. |
| Command syntax | Format: function name(input parameters). For example, the command syntax of the sum function is sum(col). |
| Usage instructions | Describes what the custom function does. For example, the sum function calculates the aggregate value. |
| Parameter description | Describes the parameters of the custom function. For example, for the sum function: col is required. Column values can be of type DOUBLE, DECIMAL, or BIGINT. If the input is of type STRING, it is implicitly converted to DOUBLE before the operation. |
| Return values | Describes the return value of the custom function. For example, the sum function returns a DOUBLE. |
| Example | An example of using the custom function. For example, to calculate the total amount of product sales with the sum function, the command is: select sum(sales) from table. |
3. After a function is changed, its history is saved as versions. Each version records the version number, submitter, submission time, change type, and remarks, and you can roll back to a historical version.


Function Example

Spark SQL Function Development Example

1. Create a project
Create a Maven project for the function code; the hive-exec dependency is added in the next step. You can create the project with the mvn command line, as shown below, or in IntelliJ IDEA. Replace the groupId and artifactId values with your own names.
mvn archetype:generate -DgroupId=com.example -DartifactId=demo-hive -Dversion=1.0-SNAPSHOT -Dpackage=com.example

2. Writing code
Add the hive-exec dependency and the junit test dependency to pom.xml.
<dependencies>
  <dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>2.3.8</version>
    <exclusions>
      <exclusion>
        <groupId>org.pentaho</groupId>
        <artifactId>pentaho-aggdesigner-algorithm</artifactId>
      </exclusion>
      <exclusion>
        <groupId>javax.servlet</groupId>
        <artifactId>servlet-api</artifactId>
      </exclusion>
      <exclusion>
        <groupId>org.eclipse.jetty.orbit</groupId>
        <artifactId>javax.servlet</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version>
    <scope>test</scope>
  </dependency>
</dependencies>
Create a Java class in the src/main/java/com/example directory that extends org.apache.hadoop.hive.ql.exec.UDF, and implement the behavior of the custom function in an evaluate method. The example below converts the input string to uppercase.
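package com.example;

import org.apache.hadoop.hive.ql.exec.UDF;

// Custom function that converts the input string to uppercase.
public class UppercaseUDF extends UDF {
    // evaluate is called once per row; a NULL column value arrives as null.
    public String evaluate(String input) {
        if (input == null) {
            return null;
        }
        return input.toUpperCase();
    }
}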
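The uppercase logic can be checked locally before it is packaged. The following is a minimal sketch, not part of the WeData workflow: UppercaseCheck is a hypothetical stand-in that repeats the evaluate logic without extending UDF, so it compiles and runs with a plain JDK and no hive-exec on the classpath. The null check and the use of Locale.ROOT are assumptions of this sketch.

```java
import java.util.Locale;

// Hypothetical stand-in for UppercaseUDF: same idea, but it does not extend
// org.apache.hadoop.hive.ql.exec.UDF, so it compiles without hive-exec.
class UppercaseCheck {

    // Mirrors the evaluate method; returns null for null input (SQL NULL).
    static String evaluate(String input) {
        if (input == null) {
            return null;
        }
        // Locale.ROOT is an assumption of this sketch; it keeps the result
        // independent of the JVM's default locale.
        return input.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("hello"));   // prints HELLO
        System.out.println(evaluate("Abc123"));  // prints ABC123
        System.out.println(evaluate(null));      // prints null
    }
}
```

In a real project the same checks would more naturally live in a junit test under src/test/java that calls new UppercaseUDF().evaluate(...) directly.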
3. Compiling and Packaging
Add the Maven build plugins below to pom.xml, then run the mvn package command from the project root to compile and package the project. The generated package is demo-hive-1.0-SNAPSHOT.jar.
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.8.1</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
    <!--(start) for package jar with dependencies -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <version>3.0.0</version>
      <configuration>
        <archive>
          <!--Specify the class where the main method is located-->
          <manifest>
            <mainClass>com.example.UppercaseUDF</mainClass>
          </manifest>
        </archive>
        <!--Do not change jar-with-dependencies-->
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <appendAssemblyId>false</appendAssemblyId>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id> <!-- this is used for inheritance merges -->
          <phase>package</phase> <!-- bind to the packaging phase -->
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
    <!--(end) for package jar with dependencies -->
  </plugins>
</build>

<repositories>
  <repository>
    <id>alimaven</id>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
  </repository>
</repositories>
Run the mvn package command:
mvn package -Dmaven.test.skip=true
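Before uploading the package, it can be useful to confirm that the jar actually contains the class that will be registered. The following is a minimal sketch using only the JDK's java.util.jar API. JarClassCheck and its methods are hypothetical helper names, and the default jar path assumes Maven's standard target output directory.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper: checks whether a jar contains a given class.
class JarClassCheck {

    // Converts a fully qualified class name to its entry path inside a jar,
    // e.g. com.example.UppercaseUDF -> com/example/UppercaseUDF.class
    static String classEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar at jarPath has an entry for className.
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(classEntryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults are assumptions based on the example project on this page.
        String jarPath = args.length > 0 ? args[0] : "target/demo-hive-1.0-SNAPSHOT.jar";
        String className = args.length > 1 ? args[1] : "com.example.UppercaseUDF";

        if (!new File(jarPath).isFile()) {
            System.out.println("Jar not found: " + jarPath);
            return;
        }
        System.out.println(className + " present in jar: " + containsClass(jarPath, className));
    }
}
```

Run it from the project root after mvn package, optionally passing the jar path and class name as arguments.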
4. Function Operations
Go to the WeData function development page and create a custom function. Enter the fully qualified class name, com.example.UppercaseUDF, and select the corresponding resource file, which is the jar package that contains the custom function code. If the resource file does not exist yet, create the resource first.
4.1 Resource Upload:
Upload the demo-hive-1.0-SNAPSHOT.jar function package through the resource management feature.



4.2 Function Creation:
Create a Spark SQL function through the function development feature.

Example Function Information:
| Information | Content |
| --- | --- |
| Function category | Other functions |
| Class name | com.example.UppercaseUDF |
| Function file | Select resource file |
| Resource file | demo-hive-1.0-SNAPSHOT.jar |
| Command syntax | UppercaseUDF(col) |
| Usage instructions | Converts the input string to uppercase |
| Parameter description | col: input parameter of string type |
| Return values | The input string in uppercase |
4.3 Function Usage:
In the Development Space, create a new SQL file and call the newly created function to verify that it works as expected.


DLC SQL Function Development Example

You can follow the UppercaseUDF example above to create and use a DLC SQL function.