Examples to Demonstrate Why Cell Coding Overpowers Text Coding

How Agile esProc Syntax Facilitates Hadoop Coding

In the previous article, I shared some experience of Hadoop coding with the agile esProc syntax. This article supplements that one with a more in-depth discussion.

First, let’s talk about cellset code.

In the previous article, I introduced the convenience of using cellset code to define variables, reference variables, and complete a complex computation in multiple steps. In fact, the cellset (or grid) also makes it simpler to reuse a computational result. Please refer to the code block below:

[Figure: esProc cellset code (image not available)]

As can be seen, the computational result in A2 is reused in B2 and A3.

Introducing grid lines into the cellset is a good idea. Grid lines keep the code naturally aligned; for example, indentation alone forms a clear and intuitive scope of work. Take the code below as an example:

[Figure: esProc cellset code (image not available)]

This looks good. The branches of the conditional statement are easy to recognize, and the code block is clear and neat without any deliberate formatting.

Then, let’s talk about object reference. What is object reference? Take a code snippet from the previous article as an example: A10: =A9.sort(sumAmount:-1).select(#<=10)

The code in A10 could be written in two separate cells, one for sorting and another for filtering. In the actual code, however, the “.” is used to consolidate the two steps into one expression: the result of the sort is referenced directly by the filter. This mechanism is referred to as object reference. Object reference reduces the coding workload and results in more agile coding.
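As a rough illustration outside esProc, the same two ways of writing the A10 step can be sketched in Python. The rows and the `product` field are invented for the example; only `sumAmount` and the "top 10 by descending amount" logic come from the snippet above.

```python
# Hypothetical summary rows standing in for the contents of cell A9.
amounts = [50, 120, 75, 300, 10, 220, 90, 15, 60, 180, 40, 260]
rows = [{"product": "p%02d" % i, "sumAmount": a} for i, a in enumerate(amounts)]

# Two-step form: a named intermediate result per step, as if written in two cells.
sorted_rows = sorted(rows, key=lambda r: r["sumAmount"], reverse=True)
top10_two_steps = sorted_rows[:10]

# Chained form: the sorted result is used directly, with no named intermediate.
top10_chained = sorted(rows, key=lambda r: r["sumAmount"], reverse=True)[:10]

print([r["sumAmount"] for r in top10_chained])
```

Both forms produce the same ten rows; the chained form simply avoids naming a result that is used only once.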

Support for Writing SQL Statements Directly

Big data computation usually involves access to a Hive database or a traditional database. MapReduce requires users to write the complex connection, statement, and result-handling code, while esProc lets users compose the SQL statement directly and saves them all this trouble. For example, to get the sales records from the data source HData of a Hive database, esProc lets users complete all the work with one statement: $(HData)select * from sales
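To make the contrast concrete, here is a minimal Python sketch of what "connection, statement, result" handling looks like and how a one-line query helper hides it. This is only an analogy: an in-memory SQLite database stands in for the Hive data source, and the `query` helper, the table layout, and the rows are all assumptions made for illustration.

```python
import sqlite3


def query(conn, sql, params=()):
    """Run one SQL statement and return all rows, hiding the cursor handling."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        return cur.fetchall()
    finally:
        cur.close()


# In-memory SQLite stands in for the data source; the table and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10.0), (2, 25.5)])

# With the boilerplate wrapped once, fetching the records is a single statement.
records = query(conn, "SELECT * FROM sales ORDER BY id")
print(records)
conn.close()
```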

Function Options

First, let’s look at these two statements in the sample code from the first article:

• Code for node machine A2: =A1.groups(${gruopField};${method}(${sumField}):Amount)

• Code for summary machine A9: =A8.groups@o(${gruopField};${method}(Amount):sumAmount)

The former uses groups directly to group unsorted data. The latter uses the @o option to indicate that the data are already sorted, so the grouping runs much faster. @o is a function option: options let a single function cover several variants of a task, so users do not have to memorize a separate function name for each variant. In addition to @o, the groups function also has the @m and @n options.

Function options are a nice design that makes the function structure much simpler and the coding more agile.
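The idea of one function whose behavior is tuned by an option can be sketched in Python. This is an analogy, not esProc's implementation: the `group_sum` function, its `presorted` flag, and the sample rows are hypothetical. The flag plays the role the article gives to @o, switching from hash-based grouping to a cheaper pass that only merges adjacent rows.

```python
from collections import defaultdict
from itertools import groupby


def group_sum(rows, key, value, presorted=False):
    """Sum `value` per `key`. With presorted=True, equal keys must be adjacent."""
    if presorted:
        # Single pass that only compares neighbouring rows; no hash table needed.
        return {k: sum(r[value] for r in grp)
                for k, grp in groupby(rows, key=lambda r: r[key])}
    # General case: works for any input order.
    totals = defaultdict(float)
    for r in rows:
        totals[r[key]] += r[value]
    return dict(totals)


# Invented sample data, already ordered by region.
rows = [
    {"region": "East", "Amount": 10.0},
    {"region": "East", "Amount": 5.0},
    {"region": "West", "Amount": 7.0},
    {"region": "West", "Amount": 3.0},
]

print(group_sum(rows, "region", "Amount"))
print(group_sum(rows, "region", "Amount", presorted=True))
```

The caller keeps one function name in mind and chooses the faster path only when the precondition (sorted input) is known to hold.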

Multi-level Parameter

The multi-level parameter (also called a hierarchical parameter) makes the syntax much more agile. It is a way to represent function parameters at different levels. For example, grade employees by performance score:

• If the performance score is higher than 90, set the grade to “A”

• If the performance score is above 60 and at most 90, set the grade to “B”

• If the performance score is above 30 and at most 60, set the grade to “C”

• If the performance score is 30 or below, set the grade to “D”

In esProc, the above parameters can be represented like this: score>90:"A",score>60 && score<=90:"B",score>30 && score<=60:"C";"D"

In this case, the parameters fall into three levels. At the outermost level, the branches and the default value are separated by “;”. At the middle level, the branches are separated from each other by “,”. At the innermost level, the condition and the result in each branch are separated by “:”. Together they form a three-level tree of parameters.
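The three-level structure can be mirrored in a short Python sketch: a list of branches (middle level), each a condition/result pair (innermost level), plus a separate default (outermost level). The `grade` function is hypothetical, and assigning the boundary scores 90, 60 and 30 to the lower grade is an assumption.

```python
def grade(score):
    """Return the first matching branch result, or the default if none match."""
    # Middle level: the branches. Innermost level: (condition, result) pairs.
    branches = [
        (lambda s: s > 90, "A"),
        (lambda s: 60 < s <= 90, "B"),
        (lambda s: 30 < s <= 60, "C"),
    ]
    # Outermost level: the default is kept apart from the branches.
    default = "D"
    for condition, result in branches:
        if condition(score):
            return result
    return default


for s in (95, 90, 61, 60, 30):
    print(s, grade(s))
```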

Set-style Grouping

esProc supports set-style grouping, which also makes for agile coding. The essence of its dynamic data types is the set. Specifically, a simple value is a set with a single member, an array is a set of similar data, and a two-dimensional table is a set of records. A member of a set can itself be another set. esProc can therefore represent the concept of grouping directly: each group is a member of a set, and each member is itself a set. Thanks to the agile syntax, set-style grouping can be used to solve complex grouping and computation problems. For example, find the sales persons who signed the most and the fewest insurance policies. The code is shown below:

[Figure: esProc cellset code (image not available)]

A1 cell: Group by sales person. Each group is a set of all the policies of one sales person.

A2 cell: Sort the groups by the number of policies. In the code snippet, “~” represents the group of policies belonging to each sales person.

A3 cell: Find the groups with the most and the fewest policies. They are the first group and the last group in cell A2.

A4 cell: List the names of the sales persons. They are the sales persons corresponding to the two groups of policies in A3.
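The four steps can be followed in a small Python sketch where each group stays a full list of policy records instead of being collapsed to a count. The policy data and field names are invented, and sorting in descending order (so that the first group is the largest) is an assumption about the missing code.

```python
from collections import defaultdict

# Invented policy records.
policies = [
    {"policy_id": 1, "sales_person": "Ann"},
    {"policy_id": 2, "sales_person": "Bob"},
    {"policy_id": 3, "sales_person": "Ann"},
    {"policy_id": 4, "sales_person": "Cid"},
    {"policy_id": 5, "sales_person": "Ann"},
    {"policy_id": 6, "sales_person": "Cid"},
]

# A1: group by sales person; each group is itself a set (list) of policies.
by_person = defaultdict(list)
for p in policies:
    by_person[p["sales_person"]].append(p)
groups = list(by_person.values())

# A2: sort the groups by how many policies each one holds (descending here).
sorted_groups = sorted(groups, key=len, reverse=True)

# A3: the first and the last group hold the most and the fewest policies.
extremes = [sorted_groups[0], sorted_groups[-1]]

# A4: read the sales person's name from any record in each group.
names = [g[0]["sales_person"] for g in extremes]
print(names)
```

Because the groups keep their member records, A4 can read the names straight from the groups without joining back to the original table.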

 

The agile syntax of esProc boosts the efficiency of code development, and reduces the development workload dramatically.

Web: http://www.raqsoft.com/product-esproc

More Stories By Jessica Qiu

Jessica Qiu is the editor of Raqsoft. She provides press releases for data computation and data analytics.
