Python and Hive: A Tool to Simplify Curation | 2nd Script | Work in Progress!
Here I am with the second part of my little project, the first part of which you can find here.
What is this all about?
It is a set of scripts meant to simplify curation for a Community or project that wants to curate posts published with a specific tag and in a specific language.
My reference, as I work on this small project, is the Olio di Balena Community, but the scripts can be easily adapted to work with any tag and many different languages.
As a language I am using Python, which is the first programming language I have chosen to try to learn: I am still just starting out, but working on a project that has a real application helps me stay motivated and more focused.
Just by writing these two scripts I have already learned new things and come across interesting libraries and modules with a wide range of applications.
Now, though, let's see what this second script, which you will find a little further below, does.
Purpose of the second script
The first script monitors the posts published on the chain and collects into a document those that meet specific requirements (e.g., “ita” tag, Italian language, minimum length of 500 words). This second script instead upvotes and comments on all the selected posts, then moves the list of processed posts from the “posts_to_do” folder to the “posts_done” folder.
Thus, the only manual tasks a curator has to perform are:
- check the posts in the list by deleting those not judged to be of sufficient quality;
- decide how much to upvote each post.
Since automated curation of posts is not allowed on Hive, I decided to leave these two tasks to a human curator, so that spam or low-quality posts do not receive an upvote.
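As a rough illustration of this hand-off, the list is a .csv file with a URL column and an Upvote_Value column (the column names the second script reads). The following is a minimal sketch, using only the standard library rather than pandas, of how rows might be filtered so that only entries with a whole-number weight between 1 and 100 are acted upon. The sample URLs and the helper name `valid_rows` are hypothetical and are not part of the script itself.

```python
import csv
import io

# Hypothetical list as a curator might leave it after review:
# one row has no weight, one has an out-of-range weight.
SAMPLE_CSV = """URL,Upvote_Value
https://peakd.com/hive-123456/@alice/my-first-post,50
https://peakd.com/hive-123456/@bob/another-post,
https://peakd.com/hive-123456/@carol/third-post,150
"""


def valid_rows(csv_text):
    """Return (url, weight) pairs whose weight is an integer from 1 to 100."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get("Upvote_Value") or "").strip()
        if not raw:
            continue  # the curator left no weight: skip the post
        try:
            weight = int(raw)
        except ValueError:
            continue  # not a whole number
        if 1 <= weight <= 100:
            rows.append((row["URL"], weight))
    return rows


print(valid_rows(SAMPLE_CSV))
```

Only the first sample row passes: the second has no weight and the third is above 100.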
And now here's the code!
Below is the code for the second of the two scripts I am working on, which is also already working:
#!/usr/bin/env python3
"""A script to upvote and comment posts from a .csv list"""
import os
import shutil
import jinja2
import configparser
import time
import re
import logging
import pandas as pd
from beem import Hive, exceptions as beem_e
from beem.comment import Comment
from beemapi import exceptions as beemapi_e
import beem.instance

# Global configuration
config = configparser.ConfigParser()
config.read("config")
ENABLE_COMMENTS = config["Global"]["ENABLE_COMMENTS"] == "True"
ENABLE_UPVOTES = config["Global"]["ENABLE_UPVOTES"] == "True"
ACCOUNT_NAME = config["Global"]["ACCOUNT_NAME"]
ACCOUNT_POSTING_KEY = config["Global"]["ACCOUNT_POSTING_KEY"]
HIVE_API_NODE = config["Global"]["HIVE_API_NODE"]
HIVE = Hive(node=[HIVE_API_NODE], keys=[ACCOUNT_POSTING_KEY])
beem.instance.set_shared_blockchain_instance(HIVE)
# End of global config

# Logging config
logging.basicConfig(
    filename="app.log",
    filemode="a",
    format="%(asctime)s - %(levelname)s - %(message)s",
    level=logging.INFO,
)
# End of logging config

logging.info("Configuration loaded:")
for section in config.keys():
    for key in config[section].keys():
        if "_key" in key:
            continue  # don't log posting keys
        logging.info(f"{section}, {key}, {config[section][key]}")

# Markdown template for comment
with open(os.path.join("template", "comment_curation.template"), "r") as template_file:
    comment_curation_template = jinja2.Template(template_file.read())


def give_upvote(post, author, vote_weight):
    """Upvote the post with the given weight (in percent), if upvotes are enabled."""
    if ENABLE_UPVOTES:
        print(f"Upvoting with weight {vote_weight}!")
        post.upvote(weight=vote_weight, voter=author)
        # sleep 3s before continuing
        time.sleep(3)
    else:
        print("Upvoting is disabled")


def post_comment(post, author, comment_body):
    """Reply to the post with the rendered comment, if comments are enabled."""
    if ENABLE_COMMENTS:
        print("Commenting!")
        post.reply(body=comment_body, author=author)
        # sleep 3s before continuing
        time.sleep(3)
    else:
        print("Posting is disabled")


def process_file(file_to_process):
    """Upvote and/or comment every valid row of the list, then archive the file."""
    try:
        df = pd.read_csv(file_to_process)
        for _, row in df.iterrows():
            url = row["URL"]
            vote_weight = row["Upvote_Value"]
            print(f"Work in progress on {url}...")

            # Rows without a weight are skipped
            if pd.isna(vote_weight):
                print(f"No upvote value for {url}, skipping...")
                continue
            try:
                vote_weight = int(vote_weight)
            except ValueError:
                print(f"Invalid vote weight: {vote_weight}")
                continue
            if (vote_weight < 1) or (vote_weight > 100):
                print(f"Invalid vote weight: {vote_weight}%")
                continue

            # data of the post to be upvoted and/or replied
            permlink = re.search(r".+@([\w.-]+)/([\w-]+)", url)
            if permlink is None:
                # Without this check a malformed URL would raise AttributeError
                logging.error(f"Could not parse author and permlink from {url}")
                continue
            author_account = permlink.group(1)
            post_permlink = permlink.group(2)
            reply_identifier = f"{author_account}/{post_permlink}"
            logging.info(f"{author_account} is getting a {vote_weight}% upvote!")

            try:
                post = Comment(reply_identifier, api="condenser")
            except beem_e.ContentDoesNotExistsException:
                logging.error("Post not found!")
                continue

            # leave an upvote and/or a comment
            comment_body = comment_curation_template.render(
                target_account=author_account,
            )
            try:
                give_upvote(post, ACCOUNT_NAME, vote_weight)
            except beem_e.VotingInvalidOnArchivedPost:
                logging.error("Post is too old to be upvoted")
            except beemapi_e.UnhandledRPCError:
                logging.error("Vote changed too many times")
            post_comment(post, ACCOUNT_NAME, comment_body)
    except pd.errors.EmptyDataError:
        logging.error(f"File {file_to_process} is empty. Skipping...")
    finally:
        # Once done, move the file to the "posts_done" directory
        directory_done = "posts_done"
        destination = os.path.join(directory_done, os.path.basename(file_to_process))
        shutil.move(file_to_process, destination)
        logging.info(
            f"File {os.path.basename(file_to_process)} moved to '{directory_done}' directory."
        )


def main():
    directory_to_do = "posts_to_do"
    file_to_process = None
    for filename in os.listdir(directory_to_do):
        if filename.endswith(".csv"):  # Only look for csv files
            file_to_process = os.path.join(directory_to_do, filename)
            break  # One file at a time
    if file_to_process:
        process_file(file_to_process)
    else:
        logging.info(f"No files found in the '{directory_to_do}' directory.")


if __name__ == "__main__":
    main()
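To make the URL-parsing step in process_file easier to follow, here is a small standalone sketch of the same regular expression. The wrapper name `parse_post_url` and the sample URL are hypothetical. The point is only to show what the two capture groups contain and what happens when a URL does not match.

```python
import re

# Same pattern the script uses: capture the account after "@"
# and the permlink after the following "/".
POST_URL_PATTERN = re.compile(r".+@([\w.-]+)/([\w-]+)")


def parse_post_url(url):
    """Return (author, permlink), or None when the URL does not match."""
    match = POST_URL_PATTERN.search(url)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_post_url("https://peakd.com/hive-123456/@alice/my-first-post"))
print(parse_post_url("not a post url"))
```

The first call yields the author and the permlink, which the script joins as "author/permlink" before loading the post. The second call returns None, which is why a check for a failed match is worth having before calling group().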
This time there is also a configuration file: it lets you choose whether the script should upvote and/or comment on the selected posts, and specify which account performs these operations:
[Global]
; Disable to stop the bot from posting comments
ENABLE_COMMENTS = True
; Disable to stop the bot from upvoting
ENABLE_UPVOTES = True
; Account that casts the upvotes and posts the comments
ACCOUNT_NAME = community_account
; Posting key for the account
ACCOUNT_POSTING_KEY = xxxx
; Hive API node to use for read/write/upvote ops
HIVE_API_NODE = https://api.deathwing.me
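One detail worth noting: the script compares the two flags with the exact string "True", so a value such as `true` or `yes` would silently count as disabled. A possible alternative, sketched here under the assumption that more forgiving parsing is wanted, is the standard library's `ConfigParser.getboolean()`. The inline sample config is hypothetical and only mirrors the [Global] section.

```python
import configparser

# Hypothetical config contents, mirroring part of the [Global] section.
SAMPLE_CONFIG = """
[Global]
ENABLE_COMMENTS = true
ENABLE_UPVOTES = False
ACCOUNT_NAME = community_account
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_CONFIG)

# getboolean() understands true/false, yes/no, on/off and 1/0 in any letter case,
# and raises ValueError for anything else instead of quietly returning False.
enable_comments = config.getboolean("Global", "ENABLE_COMMENTS")
enable_upvotes = config.getboolean("Global", "ENABLE_UPVOTES")

print(enable_comments, enable_upvotes)
```

With this approach a typo in the flag raises an error at startup rather than quietly turning a feature off.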
Finally, this is the template used to set up the text of the comment that can be left under the selected posts:
Hi @{{target_account}}, your post has been curated by ....
Now I'd like to add a third script (one that would create a summary post, to be published periodically), refine the first script with what I learned while writing the second one (e.g., integrating the logging library), and... well, then add more new features, if I can, and make it more and more interesting!
All this will probably remain just an exercise of mine, but I'd still like to create a full-fledged project that could potentially have real-world applications!
images property of their respective owners
to support the #OliodiBalena community, @balaenoptera is 3% beneficiary of this post
If you've read this far, thank you! If you want to leave an upvote, a reblog, a follow, a comment... well, any sign of life is really much appreciated!