jAgg 0.7.1 (Beta) Released
New in jAgg 0.7.1:
New in jAgg 0.7.0:
View a history of all changes at the Change Log.
jAgg is a Java 5.0 API that supports “group by” operations on Lists of Java objects: aggregate operations such as count, sum, max, min, avg, and many more. It also supports “super aggregate” operations such as rollups and cubes, as well as custom aggregate operations: one can create custom Aggregators to work with jAgg.
Today in Java there is no practical “group by” operation that imitates the corresponding database functionality defined by the SQL language. That is, we can’t take an arbitrary List of Objects, group them according to specific object properties, and perform aggregate operations on them. A few parts of Java do implement a little of the desired functionality. Some of them follow here:
A programmer can always write specific code that loops over a List of Objects, extracts the desired values, performs the aggregate calculations, and returns the aggregate result. But such code is very likely to be tightly coupled to the programmer's existing object types.
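To make the coupling concrete, here is a minimal sketch of such hand-written code. The `Employee` type, its fields, and the method name are hypothetical and exist only for this illustration; they are not part of jAgg.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain type, invented for this illustration only.
class Employee {
    final String department;
    final double salary;

    Employee(String department, double salary) {
        this.department = department;
        this.salary = salary;
    }
}

class HandWrittenGroupBy {
    // Sums salaries per department. The loop is hard-wired to Employee,
    // to its "department" field, and to its "salary" field, so it cannot
    // be reused for another type, another grouping, or another aggregate.
    static Map<String, Double> totalSalaryByDepartment(List<Employee> employees) {
        Map<String, Double> totals = new LinkedHashMap<String, Double>();
        for (Employee e : employees) {
            Double running = totals.get(e.department);
            totals.put(e.department, running == null ? e.salary : running + e.salary);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Employee> employees = new ArrayList<Employee>();
        employees.add(new Employee("Sales", 100.0));
        employees.add(new Employee("Sales", 50.0));
        employees.add(new Employee("IT", 70.0));
        System.out.println(totalSalaryByDepartment(employees)); // {Sales=150.0, IT=70.0}
    }
}
```

Grouping by a different field, or computing an average instead of a sum, would require writing another loop of the same shape.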
Oracle, a relational database that implements the SQL standard, supports many aggregate functions, including many that go beyond the five basic aggregate operations mentioned above, such as variance, covariance, standard deviation, correlation, linear regression, and percentile.
Oracle also allows the database user to implement custom aggregate functions, covered here.
If a database programmer creates an Oracle object type with a few specific method names, and associates this object type with the definition of a new function, then a new aggregate function is created. The object type must define methods for initialization, value iteration (processing the next row of input), merging (merging object state for parallel processing), and termination (calculation of the final result).
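The same four-phase lifecycle can be sketched in Java. The interface and class below are hypothetical illustrations of the pattern; they are neither Oracle's interface nor jAgg's actual Aggregator API, and the choice to skip null inputs is an assumption borrowed from typical SQL aggregate behavior.

```java
// Hypothetical interface mirroring the four phases described above.
interface FourPhaseAggregate<R> {
    void init();                              // initialization
    void iterate(Number value);               // process the next input value
    void merge(FourPhaseAggregate<R> other);  // fold in state from a parallel worker
    R terminate();                            // compute the final result
}

// An average implemented against that lifecycle.
class AverageAggregate implements FourPhaseAggregate<Double> {
    private double sum;
    private long count;

    public void init() {
        sum = 0.0;
        count = 0;
    }

    public void iterate(Number value) {
        if (value != null) { // assumption: nulls are skipped
            sum += value.doubleValue();
            count++;
        }
    }

    public void merge(FourPhaseAggregate<Double> other) {
        // This sketch only merges with another AverageAggregate.
        AverageAggregate o = (AverageAggregate) other;
        sum += o.sum;
        count += o.count;
    }

    public Double terminate() {
        // No non-null input yields null rather than an error.
        return count == 0 ? null : sum / count;
    }
}

class FourPhaseDemo {
    public static void main(String[] args) {
        AverageAggregate left = new AverageAggregate();
        left.init();
        left.iterate(1);
        left.iterate(2);

        AverageAggregate right = new AverageAggregate();
        right.init();
        right.iterate(3);

        left.merge(right); // combine the two partial states
        System.out.println(left.terminate()); // 2.0
    }
}
```

The merge step is what allows partial results computed independently to be combined before termination.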
What if a Java programmer obtains a List of Objects, from a database or another data source, but wants to provide multiple or customizable views that summarize or break down the data? The programmer does not want to go back to the database or data source for each breakdown a user specifies. Such queries can be costly.
In this case, a mechanism that obtains the data once and then computes aggregate functions in memory, in whatever manner is needed, is more desirable.
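As a rough sketch of that idea, the code below loads a list once and then produces two different breakdowns from it in memory. The `Sale` type and the `sumBy` helper are hypothetical and are not jAgg's API; the sketch also assumes Java 8 or later for lambdas, which is newer than the Java 5.0 baseline jAgg targets.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

// Hypothetical row type standing in for data fetched once from a data source.
class Sale {
    final String region;
    final String product;
    final double amount;

    Sale(String region, String product, double amount) {
        this.region = region;
        this.product = product;
        this.amount = amount;
    }
}

class InMemoryBreakdown {
    // Groups rows by an arbitrary key and sums an arbitrary numeric value.
    static <T, K> Map<K, Double> sumBy(List<T> rows,
                                       Function<T, K> key,
                                       ToDoubleFunction<T> value) {
        Map<K, Double> sums = new LinkedHashMap<>();
        for (T row : rows) {
            sums.merge(key.apply(row), value.applyAsDouble(row), Double::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // The data is obtained once...
        List<Sale> sales = Arrays.asList(
                new Sale("East", "Widget", 10.0),
                new Sale("West", "Widget", 20.0),
                new Sale("East", "Gadget", 5.0));

        // ...and then broken down in memory as many ways as needed.
        System.out.println(sumBy(sales, s -> s.region, s -> s.amount));  // {East=15.0, West=20.0}
        System.out.println(sumBy(sales, s -> s.product, s -> s.amount)); // {Widget=30.0, Gadget=5.0}
    }
}
```

Neither breakdown touches the original data source again; only the grouping key changes between the two calls.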
Primary Actor: Statistical Analyzer
Stakeholders and Interests:
Preconditions: A statistical analyzer has a List of values to analyze with one or more aggregate operations. Built-in operations include, but are not limited to, standard aggregate operations such as average, count, max, min, and sum.
Success guarantee: The aggregation engine generates correct values for each desired aggregate operation, or it throws a RuntimeException that indicates why an operation could not be performed.
Main Success Scenario:
Alternative Flows:
4a. A specified "group-by" or aggregation property name is invalid, a specified property is inappropriate for the aggregation being performed, or an Exception is generated by an Aggregator. The absence of any non-null values does not represent an alternative flow.
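One way the invalid-property case could surface as a RuntimeException is sketched below. It assumes that a property name maps to a public JavaBean-style getter (`name` to `getName()`); that mapping, the `PropertyReader` helper, and the `Person` type are assumptions made for illustration, not a description of how jAgg resolves properties.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical bean used only for this illustration.
class Person {
    private final String name;

    Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}

class PropertyReader {
    // Reads a property through its getter, converting every checked
    // reflection failure into an unchecked exception that says why.
    static Object read(Object bean, String property) {
        if (property == null || property.isEmpty()) {
            throw new IllegalArgumentException("Property name must not be empty");
        }
        String getter = "get" + Character.toUpperCase(property.charAt(0))
                + property.substring(1);
        try {
            Method method = bean.getClass().getMethod(getter);
            return method.invoke(bean);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No property '" + property
                    + "' on " + bean.getClass().getName(), e);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Cannot access " + getter + "()", e);
        } catch (InvocationTargetException e) {
            throw new IllegalStateException(getter + "() threw an exception", e.getCause());
        }
    }

    public static void main(String[] args) {
        Person ada = new Person("Ada");
        System.out.println(read(ada, "name")); // Ada
        try {
            read(ada, "salary"); // Person has no getSalary()
        } catch (RuntimeException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```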
Technology and Data Variations List:
2a. The statistical analyzer may indicate that the aggregation engine should use one or more custom Aggregator objects to generate custom aggregate values.