Branding and Brand Building: python - Pyspark "cannot resolve '`countryName`' given input columns: [countryName, city, age]"

Thursday, 9 Farvardin 1397 (29 March 2018)



I'm reading from a JSON file using PySpark as follows:

raw = sc.textFile(path)
dataset_df = sqlContext.read.json(raw)

To select only specific keys from the JSON file (if the key is present), I use:

dataset_df.select('countryName', 'city', 'age')
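As an aside on the "if the key is present" part: select itself does not skip missing columns, so one option is to filter the wanted names against the DataFrame's column list first. The sketch below is only an illustration under that assumption; present_columns is a hypothetical helper, not part of PySpark, and it works on plain lists of names so it can be tried without a Spark session.

```python
def present_columns(available, wanted):
    """Return the names from `wanted` that also appear in `available`, in order.

    Hypothetical helper: `available` would typically be `dataset_df.columns`.
    Matching is exact, so differences in case or whitespace count as "missing".
    """
    available_set = set(available)
    return [name for name in wanted if name in available_set]


# Assumed usage with the DataFrame from the question (needs a Spark session):
#   keys = present_columns(dataset_df.columns, ['countryName', 'city', 'age'])
#   subset_df = dataset_df.select(*keys)
```

Because the comparison is exact, a key that looks present but is silently dropped by this helper is itself a hint that the stored name differs from the one typed.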

However, running that select call raises the following error:

"cannot resolve '`countryName`' given input columns: [countryName, city, age]"

I get a similar error when I remove countryName from the list of keys to read from the JSON file. I have tested other keys from the file: for some, the code above runs without issues, but for specific columns I get the error shown above.
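One way to narrow this down, assuming the failing keys differ from the working ones in some way that is invisible when printed (for example a trailing space or a non-printable character in the key), is to inspect the column names directly. This is a guess at a possible cause, not a confirmed diagnosis. The helpers find_suspicious_names and cleaned_names below are hypothetical and operate on a plain list of names, such as the one returned by dataset_df.columns.

```python
def find_suspicious_names(columns):
    """Return (name, reasons) pairs for names that may not match what was typed."""
    flagged = []
    for name in columns:
        reasons = []
        # Surrounding whitespace is invisible in most error messages.
        if name != name.strip():
            reasons.append('leading or trailing whitespace')
        # Catches control characters and invisible marks such as a BOM.
        if any(not ch.isprintable() for ch in name):
            reasons.append('non-printable character')
        if reasons:
            flagged.append((name, reasons))
    return flagged


def cleaned_names(columns):
    """Strip surrounding whitespace from every column name (hypothetical helper)."""
    return [name.strip() for name in columns]


# Assumed usage with the DataFrame from the question (needs a Spark session):
#   for name, reasons in find_suspicious_names(dataset_df.columns):
#       print(repr(name), reasons)
#   renamed_df = dataset_df.toDF(*cleaned_names(dataset_df.columns))
```

Printing names with repr makes stray characters visible. If nothing is flagged, the cause lies elsewhere and this check only rules out one possibility.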

Does anyone know what could be the reason behind this?

Thanks in advance.
