

MySQL data import: tinyint incorrectly interpreted as bit. Bug?


Nico Timmerman on 10 Oct 2024 00:56:16

Hello,


Today I ran into an issue that looks like a bug to me.

I'm using a notebook to read from an external MySQL database. One of the tables that I import into the bronze layer has a tinyint column in the source table. However, when I read this table, the column is incorrectly interpreted as a bit, which results in incorrect data in my bronze layer: the source table contains ones and twos, but in the bronze layer everything is converted to a 1.

Any clue on how to fix this?


Here's a snippet of the code that builds the JDBC connection:

    url = f"jdbc:mysql://{datasource}?enabledTLSProtocols=TLSv1.2&serverTimezone=UTC"
    query = getQuery(tableInfo, table_name)
    return spark.read.format("jdbc").options(
        url=url,
        driver="com.mysql.cj.jdbc.Driver",
        query=query,
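
One thing that may be worth checking in the URL above: MySQL Connector/J has a connection property, `tinyInt1isBit`, which defaults to `true` and makes the driver report TINYINT(1) columns as BIT. Spark then tends to read such a column as a boolean, which would explain both 1 and 2 ending up as the same value. The sketch below only shows how the property could be appended to the URL. The helper name `build_mysql_jdbc_url` and the example host are hypothetical, and it assumes the column is declared as TINYINT(1).

```python
from urllib.parse import urlencode


def build_mysql_jdbc_url(datasource: str, tiny_int1_is_bit: bool = False) -> str:
    """Build a MySQL JDBC URL (hypothetical helper, not part of the original code).

    Keeps the two parameters from the original snippet and adds Connector/J's
    tinyInt1isBit property so TINYINT(1) is not reported as BIT.
    """
    params = {
        "enabledTLSProtocols": "TLSv1.2",
        "serverTimezone": "UTC",
        # Connector/J property; "false" asks the driver to keep TINYINT(1) numeric.
        "tinyInt1isBit": "true" if tiny_int1_is_bit else "false",
    }
    return f"jdbc:mysql://{datasource}?{urlencode(params)}"


# "dbhost:3306/sales" is a placeholder datasource for illustration only.
url = build_mysql_jdbc_url("dbhost:3306/sales")
print(url)
```

The resulting string can be passed as `url=url` to `spark.read.format("jdbc").options(...)` exactly as before. Because the change lives in the connection URL, the dynamically generated queries do not need to be touched.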


The problem is that the queries are generated dynamically, so I can't easily force the schema to a certain data type.
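
If the column types are available while the query is being generated, another option might be to cast tinyint columns inside the generated SQL, so the driver never reports them as BIT. This is only a rough sketch: the function `select_with_tinyint_casts` and the `column_types` mapping are hypothetical and say nothing about how `getQuery` or `tableInfo` actually work. `CAST(... AS SIGNED)` is standard MySQL and yields a signed integer.

```python
def select_with_tinyint_casts(table_name: str, column_types: dict) -> str:
    """Generate a SELECT that casts tinyint columns to a signed integer.

    column_types is an assumed mapping of column name -> MySQL type name,
    e.g. {"id": "int", "status": "tinyint(1)"}.
    """
    exprs = []
    for name, mysql_type in column_types.items():
        if mysql_type.lower().startswith("tinyint"):
            # Cast so the value arrives as an integer rather than a bit/boolean.
            # Assumes signed values; unsigned columns would need AS UNSIGNED.
            exprs.append(f"CAST(`{name}` AS SIGNED) AS `{name}`")
        else:
            exprs.append(f"`{name}`")
    return f"SELECT {', '.join(exprs)} FROM `{table_name}`"


# Example with made-up table and column names.
query = select_with_tinyint_casts("orders", {"id": "int", "status": "tinyint(1)"})
print(query)
```

The generated string could then be passed as the `query` option of the JDBC reader in place of the uncast SELECT.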


Thanks