I am trying to aggregate some rows in my PySpark DataFrame based on a condition. Here is my dataframe:
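Since the dataframe itself is cut off here, a minimal sketch of conditional aggregation in PySpark, assuming hypothetical columns key, amount, and status; F.when() inside the aggregate restricts the sum to rows matching the condition:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("conditional-agg").getOrCreate()

# Stand-in data; the asker's real dataframe was omitted above.
df = spark.createDataFrame(
    [("a", 1, "open"), ("a", 2, "closed"), ("b", 3, "open")],
    ["key", "amount", "status"],
)

# Sum `amount` per `key`, but only over rows where status == "open".
# Rows failing the condition yield NULL, which sum() ignores.
agg = df.groupBy("key").agg(
    F.sum(F.when(F.col("status") == "open", F.col("amount"))).alias("open_amount")
)
agg.show()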
When I was searching for a memory-related issue in Spark, I came across this article, which suggests reducing the number of cores per executor, but in the same article it's mentioned
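For reference, a minimal sketch of the trade-off that advice rests on: fewer cores per executor means fewer tasks sharing one executor heap, so each task gets a larger share of memory. The values and app name below are illustrative only, and executor resources must be set when the application is launched:

from pyspark.sql import SparkSession

# Illustrative values; in practice these are often passed to spark-submit,
# since executor resources are fixed at launch time.
spark = (
    SparkSession.builder
    .appName("memory-tuning-example")        # hypothetical app name
    .config("spark.executor.memory", "8g")   # total heap per executor
    .config("spark.executor.cores", "2")     # fewer concurrent tasks -> more heap per task
    .getOrCreate()
)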
I am fetching an RDBMS table over JDBC, split into some 10-20 partitions on ROW_NUM. Then, from each of these partitions, I want to process/format the data and write one or more files out to file storage
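A minimal sketch of such a partitioned JDBC read followed by a formatted write, assuming a hypothetical PostgreSQL source, illustrative bounds, and made-up column names; Spark splits the ROW_NUM range into numPartitions parallel queries:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-partitioned-read").getOrCreate()

# Hypothetical connection details. ROW_NUM is assumed to be a numeric
# column exposed by the source query so Spark can range-partition on it.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/mydb")
    .option("dbtable", "(SELECT t.*, ROW_NUMBER() OVER (ORDER BY id) AS ROW_NUM FROM my_table t) AS src")
    .option("user", "etl_user")
    .option("password", "secret")
    .option("partitionColumn", "ROW_NUM")
    .option("lowerBound", "1")
    .option("upperBound", "1000000")   # assumed total row count
    .option("numPartitions", "15")     # the 10-20 partitions from the question
    .load()
)

# Format, then write; maxRecordsPerFile splits each partition into
# one or more output files on file storage.
formatted = df.select("id", "name")    # hypothetical columns
(formatted.write
    .mode("overwrite")
    .option("maxRecordsPerFile", 100000)
    .csv("/mnt/storage/out"))          # hypothetical output path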
When inserting data into Cassandra, I use Spark Structured Streaming (Kafka to Cassandra). Inserting the data is no problem, but when I load the data back, an error sometimes occurs: SQL Error: com.datastax.driver.core.exceptions.Un
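A minimal sketch of the insert path described here, assuming hypothetical broker, topic, schema, keyspace, and table names; writing each micro-batch with the Cassandra batch writer via foreachBatch is one common pattern with the spark-cassandra-connector:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.appName("kafka-to-cassandra").getOrCreate()

# Hypothetical message schema.
schema = StructType([
    StructField("id", StringType()),
    StructField("payload", StringType()),
])

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
    .option("subscribe", "events")                      # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("v"))
    .select("v.*")
)

# Write each micro-batch to Cassandra with the batch writer.
def write_batch(batch_df, batch_id):
    (batch_df.write
        .format("org.apache.spark.sql.cassandra")
        .options(keyspace="my_ks", table="events")      # hypothetical keyspace/table
        .mode("append")
        .save())

query = (
    stream.writeStream
    .foreachBatch(write_batch)
    .option("checkpointLocation", "/tmp/ckpt")          # hypothetical checkpoint dir
    .start()
)
query.awaitTermination()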
I define vals like this:

val config = Config(args)
val product_type = config.product_type

Then I send product_type as "AA"