feat: add data_converter_utils, field_type_utils, file_type and decim…#38

Open
lszskye wants to merge 1 commit into apache:main from lszskye:p3.7

lszskye commented Jun 2, 2026

Purpose

Introduce utility classes for data type conversion, field type introspection, file type classification, and decimal operations.

Changes

DataConverterUtils: Provides bidirectional converters between string values and BinaryRow fields.

FieldTypeUtils: Provides helpers for FieldType, e.g., integer type checking and scale comparison.

FileType: Enum class defining the file types kMeta, kData, kBucketIndex, kGlobalIndex, and kFileIndex.

DecimalUtils: Validates Arrow decimal type constraints.

Tests

  • DataConverterUtilsTest
  • FieldTypeUtilsTest
  • FileTypeTest
  • DecimalUtilsTest

leaves12138 left a comment

Thanks for the migration. I found one blocker in DataConverterUtils: non-legacy partition string conversion for negative FLOAT/DOUBLE values uses scientific notation because the fixed-format branch only accepts positive values. This breaks round-trip/path compatibility for negative floating partition values.

template <typename T>
static std::string FloatValueToString(const T& value, int32_t precision) {
    std::stringstream oss;
    if (value >= 1e-3 && value <= 1e7) {

This condition should use the magnitude rather than the signed value. As written, negative values such as -233.0 or -467.66472 take the scientific branch and are emitted as -2.33E2 / -4.6766472E2, while positive values in the same range are emitted in fixed decimal form. That makes CreateBinaryRowFieldToStringConverter fail to round-trip negative FLOAT/DOUBLE partition values and can produce Java-incompatible partition path names. Please use std::abs(value) for the range check and add negative float/double coverage.
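For illustration, the suggested magnitude check could look roughly like the sketch below. This is not the PR's code: `InFixedRange` and `FormatFixedOrScientific` are hypothetical names, the meaning of `precision` is assumed to be the number of digits after the decimal point, and the scientific branch uses the standard `std::scientific` manipulator as a stand-in rather than the Java-style `E` notation the real function emits.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: true when the magnitude of `value` lies in the range
// that should be printed in fixed decimal form. Using std::abs makes the
// check symmetric, so -233.0 is treated the same way as 233.0.
template <typename T>
bool InFixedRange(const T& value) {
    const T magnitude = std::abs(value);
    return magnitude >= 1e-3 && magnitude <= 1e7;
}

// Hypothetical, simplified formatter. Only the branch selection mirrors the
// review suggestion; the output formats here are assumptions.
template <typename T>
std::string FormatFixedOrScientific(const T& value, int32_t precision) {
    assert(precision >= 0);  // assumed precondition for this sketch
    std::ostringstream oss;
    if (InFixedRange(value)) {
        oss << std::fixed << std::setprecision(precision) << value;
    } else {
        // Placeholder for the scientific branch of the real implementation.
        oss << std::scientific << std::setprecision(precision) << value;
    }
    return oss.str();
}
```

Negative coverage along these lines would check that `InFixedRange(-233.0)` and `InFixedRange(-467.66472f)` both hold, and that the converter's output for a negative value differs from the positive one only by the leading sign.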
