Zebra Change Log

Trunk (unreleased changes)

  INCOMPATIBLE CHANGES

    PIG-1455 Addition of test-unit as an ant target (yanz)

    PIG-1451 Change the build.test property in build to test.build.dir to be consistent with PIG (yanz)

    PIG-1444 Addition of test-smoke ant target (gauravj via yanz)

    PIG-1357 Addition of Test cases of map-side GROUP-BY (yanz)

    PIG-1282 make Zebra's pig test cases run on real cluster (chaow via yanz)

    PIG-1164 Addition of smoke tests (gauravj via yanz)

    PIG-1122 Changed version number of pig dev core jar used in Zebra build from 0.6.0 to 0.7.0
        to match Pig version number (yanz)

    PIG-1099 Changed version number to be 0.7.0 to match Pig version number
        change (yanz via gates)

  IMPROVEMENTS

    PIG-1425 support of source table index on unsorted table in the mapred APIs (yanz)

    PIG-1375 Support of multiple Zebra table writing through Pig (chaow via yanz)

    PIG-1351 Addition of type check when writing to basic table (chaow via yanz)

    PIG-1361 Zebra TableLoader.getSchema() should return the projectionSchema specified in the constructor of TableLoader instead of the projection pruned by Pig (gauravj via daijy)

    PIG-1291 Support of virtual column "source_table" on unsorted table (yanz)

    PIG-1315 Implementing OrderedLoadFunc interface for Zebra TableLoader (xuefuz via yanz)

    PIG-1306 Support of locally sorted input splits (yanz)

    PIG-1268 Need an ant target that runs all pig-related tests in Zebra (xuefuz via yanz)

    PIG-1207 Data sanity check should be performed at the end of writing instead of later at query time (yanz)

    PIG-1206 Storing descendingly sorted PIG table as unsorted table (yanz)

    PIG-1240 zebra manifest file enhancement (gauravj via yanz)

    PIG-1140 Support of Hadoop 2.0 API (xuefuz via yanz)

    PIG-1170 new end-to-end and stress test cases (jing1234 via yanz)

    PIG-1136 Support of map split on hash keys with leading underscore (xuefuz via yanz)

    PIG-1125 Map/Reduce API Changes (Chao Wang via yanz)

    PIG-1104 Streaming Support (Chao Wang via yanz)

    PIG-1119 Support of "group" as a column name (Gaurav Jain via yanz)

    PIG-653  Pig Projection Push Down (Gaurav Jain via yanz)

    PIG-1111 Multiple Outputs Support (Gaurav Jain via yanz)

    PIG-1098 Zebra Performance Optimizations (yanz via gates)

    PIG-1074 Zebra store function should allow '::' in column names in output
        schema (yanz via gates)

    PIG-1077 Support record(row)-based file split in Zebra's
        TableInputFormat (chaow via gates)

    PIG-1089 Use Pig's version for Zebra's own version (chaow via olgan)

    PIG-1069  Order Preserving Sorted Table Union (yanz via gates)

    PIG-997 Sorted Table Support by Zebra (yanz via gates)
  	
    PIG-996 Add findbugs, checkstyle, and clover to zebra build file (chaow via
        gates)

    PIG-993 Ability to drop a column group in a table (yanz and rangadi via gates)

    PIG-992 Separate schema related files into a schema package (yanz via
        gates)

  OPTIMIZATIONS

    PIG-1198: performance improvements through use of unsorted input splits that span multiple files (yanz)

  BUG FIXES

    PIG-1453 Intermittent failure in pigtest (yanz)

    PIG-1432 Debugging info is written to STDOUT in PIG's TableStorer call path (yanz)

    PIG-1421 Name node calls made by each mapper (xuefuz via yanz)

    PIG-1342 Avoid making unnecessary name node calls for writes in Zebra (chaow via yanz) 

    PIG-1356 TableLoader makes unnecessary calls to build a Job instance that creates a new JobClient in hadoop 0.20.9 (yanz)

    PIG-1349 Hudson test failure in test case TestBasicUnion (xuefuz via yanz)

    PIG-1340 The zebra version number should be changed from 0.7 to 0.8 (yanz)

    PIG-1318 Invalid type for source_table field when using order-preserving Sorted Table Union (gauravj via yanz)

    PIG-1258 Number of sorted input splits is unusually high (yanz)

    PIG-1269 Restrict schema definition for collection (xuefuz via yanz)

    PIG-1253: make map/reduce test cases run on real cluster (chaow via yanz)

    PIG-1276: Pig resource schema interface changed, so Zebra needs to catch exception thrown from the new interfaces. (xuefuz via yanz)

    PIG-1256: Bag field should always contain a tuple type as the field schema in ResourceSchema object converted from Zebra Schema (xuefuz via yanz)

    PIG-1227: Throw exception if column group meta file is missing for an unsorted table (yanz)

    PIG-1201: unnecessary name node calls by each mapper; too big input split serialization size by Pig's Slice implementation (yanz)

    PIG-1115: cleanup of temp files left by failed tasks (gauravj via yanz)

    PIG-1167: Hadoop file glob support (yanz)

    PIG-1153: Record split exception fix (yanz)

    PIG-1145: Merge Join on Large Table throws an EOF exception (yanz)

    PIG-1095: Schema support of anonymous fields in COLLECTION fails (yanz via
        gates)

    PIG-1078: merge join with empty table failed (yanz via gates)

    PIG-1091: Exception when load with projection of map keys on a map column
        that is not map split (yanz via gates).

    PIG-1026: [zebra] map split returns null (yanz via pradeepkth)

    PIG-1057 Zebra does not support concurrent deletions of column groups now
        (chaow via gates).

    PIG-944  Change schema to be taken from StoreConfig instead of
        TableStorer's constructor (yanz via gates).

    PIG-918. Fix infinite loop when only columns in first column group are
    specified. (Yan Zhou via rangadi)
 
    PIG-949. If an entire map is placed in non default column group, 
    and a specific key placed in another CG, the second CG did not 
    work as expected. (Yan Zhou, Jing Huang (tests) via rangadi)

    PIG-987. Access control for column groups. Users can specify
    desired group, owner, and permissions for a column group.
    (Yan Zhou via rangadi)
    
    PIG-991. Various minor bugs. (Yan Zhou via rangadi)

    PIG-986. Column groups can have explicit names specified in
    storage hint. (Yan Zhou via rangadi)

