E2.2: Welcome Agentic
Agentic is live - an Agentic Workflow that scrapes Reddit, formats the scraped content, and posts its findings on X daily. Check it out at X: @4GENTIC.
Welcome back to The Application Layer. Remember how we talked about a future where systems could do almost anything for us, just like in the movie “Her”? Well, today isn't about dreaming anymore; it's about making some of that happen for real. We're going to roll up our sleeves and get our hands dirty with a project that's as exciting as it is useful. Imagine having a little digital helper that can read content from the internet, make it look nice, and share customised content on your favourite platform, all by itself. Inspired by this tutorial, I'm here to show you how to do exactly that: scraping Reddit, formatting the content, and posting it on X.
Out of personal interest, I went down the crypto rabbit hole for this project. The Agent is set up to scrape the CryptoCurrency subreddit, which is highly active and contains valuable information, but it takes time to sift through all the nonsense. The Agent now does that for me. The result is the creature that has come to life: Agentic (4GENTIC).
For this adventure, we're teaming up with Crew AI, an AI Agent framework created by João Moura. It's open for everyone to use, which means you, me, and anyone with a curious mind can see how it works and even tweak it to do new things. Let's explore how to put together our own little digital assembly line using Crew AI, and remember, this is just the start. The real magic happens when you start playing with these ideas yourself. Let's get started!
Crew AI: An Open Source Agent Framework
Crew AI offers a framework specifically designed to orchestrate the creation and management of AI agents. At its core, Crew AI facilitates the development of autonomous agents capable of performing a wide range of tasks, from data scraping to content formatting and even social media management.
By leveraging Crew AI, users can assemble teams of digital agents, each with specialised roles or tasks. These could include an Explorer Crew tasked with gathering data from the internet, a Formatting Crew responsible for organising this data in a readable or presentable format, and a Social Media Crew designed to distribute content across various social media platforms. This modular approach allows for flexibility and customisation, catering to the specific needs or goals of any project.
The framework's design encourages experimentation and innovation, serving not just as a tool but as a platform for learning and exploration in the field of AI. Whether it's for educational purposes, personal projects, or professional development, Crew AI offers a solid foundation for building and deploying AI agents in the real world.
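To make the vocabulary concrete before we dive in, here is a minimal sketch of the CrewAI pattern we'll be following: agents with roles, tasks assigned to them, and a crew that runs the tasks in sequence. The role and description strings below are placeholders, not the ones used in this project.
from crewai import Agent, Task, Crew, Process

# A placeholder agent, task and crew to illustrate the CrewAI building blocks
agent = Agent(
    role="Researcher",
    goal="Find relevant posts",
    backstory="An analyst who knows where to look.",
)
task = Task(
    description="Summarise today's findings in a short report.",
    expected_output="A short, readable report.",
    agent=agent,
)
crew = Crew(agents=[agent], tasks=[task], process=Process.sequential)
print(crew.kickoff())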
Tutorial Overview: Scraping Reddit, Formatting, and Posting on X
The plan is clear. There are three tasks: browse Reddit, format the output, and post to X. I will create a single Crew to browse Reddit and format the output. Posting the result to X doesn't require any creativity, so instead of adding it to the workflow, I apply a posting function to the formatted output.
The steps to take are:
Browse Reddit: create Agent and Task
Format the output: create Agent and Task
Kickoff the Crew
Post to X
The project consists of four files:
main.py which contains the workflow
custom_functions.py which contains the X posting code
.env which contains your environment API keys
requirements.txt which contains all the necessary packages
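For reference, the credential variables used in the snippets below (client_id_re, bearer_token_tw, and so on) are never defined there. A minimal sketch of how they could be loaded from the .env file with python-dotenv follows; the .env variable names are hypothetical.
import os
from dotenv import load_dotenv

load_dotenv()  # read API keys from the .env file

# Reddit credentials used by the BrowserTool (hypothetical .env names)
client_id_re = os.getenv("REDDIT_CLIENT_ID")
client_secret_re = os.getenv("REDDIT_CLIENT_SECRET")

# X / Twitter credentials used by the posting function (hypothetical .env names)
bearer_token_tw = os.getenv("TWITTER_BEARER_TOKEN")
api_key_tw = os.getenv("TWITTER_API_KEY")
api_key_secret_tw = os.getenv("TWITTER_API_KEY_SECRET")
access_token_tw = os.getenv("TWITTER_ACCESS_TOKEN")
access_token_secret_tw = os.getenv("TWITTER_ACCESS_TOKEN_SECRET")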
Browse Reddit: create Agent and Task
We need to set up a BrowserTool class to prepare the browsing:
import time

import praw
from langchain.tools import tool  # assumed source of the @tool decorator, as in early CrewAI examples


class BrowserTool:
    @tool("Scrape reddit content")
    def scrape_reddit(max_comments_per_post=7):
        """Useful to scrape reddit content"""
        reddit = praw.Reddit(
            client_id=client_id_re,
            client_secret=client_secret_re,
            user_agent="user-agent",
        )
        subreddit = reddit.subreddit("CryptoCurrency")
        scraped_data = []
        for post in subreddit.hot(limit=12):
            post_data = {"title": post.title, "url": post.url, "comments": []}
            try:
                post.comments.replace_more(limit=0)  # Load top-level comments only
                comments = post.comments.list()
                if max_comments_per_post is not None:
                    comments = comments[:max_comments_per_post]
                for comment in comments:
                    post_data["comments"].append(comment.body)
                scraped_data.append(post_data)
            except praw.exceptions.APIException as e:
                print(f"API Exception: {e}")
                time.sleep(60)  # Sleep for 1 minute before retrying
        return scraped_data
Define an Agent (explorer) and a Task (task_report):
explorer = Agent(
    role="Senior Researcher",
    goal="Find and explore the most exciting developments on CryptoCurrency subreddit today",
    backstory="""You are an expert strategist that knows how to spot emerging trends and important events in crypto, blockchain and web3 TODAY.
    You're great at finding exciting projects on CryptoCurrency subreddit. You turn scraped data into detailed reports with titles
    of the most exciting developments in the crypto world. ONLY use scraped data from CryptoCurrency subreddit for the report. Make sure not to forget the links to the posts.
    """,
    verbose=True,
    allow_delegation=False,
    tools=[BrowserTool().scrape_reddit] + human_tools,
    llm=llm,
)

task_report = Task(
    description="""Use and summarize scraped data from subreddit CryptoCurrency to make a detailed report on today's developments in crypto. Use ONLY
    scraped data from CryptoCurrency to generate the report. Your final answer MUST be a full analysis report, text only, ignore any code or anything that
    isn't text except for links. The report has to have bullet points with 5-10 exciting crypto developments.
    Each bullet point MUST contain 3 sentences that refer to one specific development you found on subreddit CryptoCurrency. Each bullet point must contain a link to the post.
    """,
    agent=explorer,
    expected_output="""A detailed report with bullet points, each containing 3 sentences about a specific crypto development found on the CryptoCurrency subreddit.""",
)
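The explorer references llm and human_tools, which aren't shown above. Here is a minimal sketch of how they might be defined, assuming an OpenAI chat model via LangChain and LangChain's built-in human-in-the-loop tool; both choices are assumptions, and any LangChain-compatible model and tool list would work.
from langchain.agents import load_tools
from langchain_openai import ChatOpenAI  # assumed model provider

# Lets the agent ask a human for input when it gets stuck (assumption)
human_tools = load_tools(["human"])

# The LLM that powers the agents; the model name is an assumption
llm = ChatOpenAI(model="gpt-4", temperature=0.7)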
Format the output: create Agent and Task
Define an Agent (formatter) and a Task (task_formatter):
formatter = Agent(
    role="Expert Writing formatter",
    goal="Make sure the output is in the right format. Make sure that the tone and writing style is compelling, simple and concise",
    backstory="""You are an Expert at formatting text from technical writers. You can tell when a report text isn't concise,
    simple or engaging enough. You know how to make sure that text stays technical and insightful by using layman terms. You know how to format a report properly.
    """,
    verbose=True,
    allow_delegation=True,
    llm=llm,
)

task_formatter = Task(
    description="""
    Format the explorer's output for Twitter posting by structuring the information according to the following template:
    - For each news item, begin with the title.
    - Follow the title with a bullet point listing interesting facts about the news item.
    - Ensure each fact is compelling, engaging, and provides value to the reader.
    - Conclude each news item summary with a bullet point that includes a link to the original post.
    The output must be clearly structured, contain accurate and engaging information, and each news item must have a corresponding link. Your task is to verify the structure, assess the compelling nature of the text, and ensure all links are correctly included.
    """,
    agent=formatter,
    expected_output="""
    The formatted output should look like this for each news item:
    [Title of News Item]
    - An interesting and engaging fact about the news item. The fact should be compelling and present the news in an engaging way.
    - A direct link to the original post for readers to find more information.
    "Title of News Item" should be replaced with the actual title. This format should be repeated for each news item included in the explorer's output, ensuring a consistent and engaging reader experience.
    """,
)
Kickoff the Crew
# Instantiate the crew of agents
crew = Crew(
    agents=[explorer, formatter],
    tasks=[task_report, task_formatter],
    verbose=2,
    process=Process.sequential,  # Tasks run one after the other; each task's outcome is passed as extra context to the next
)

# Get your crew to work!
result = crew.kickoff()
Post to X
Split the result into tweets that respect X's character limit:
# custom_functions.py
import datetime
import re
import time

import tweepy


def split_section_into_tweets(section, limit=280):
    """Splits a section of the message into parts that fit within the tweet limit, accounting for URL shortening."""
    url_placeholder = "https://t.co/xxxxxxxxxx"  # Placeholder for URLs, adjust length as needed
    urls = re.findall(r'(https?://\S+)', section)
    adjusted_section = re.sub(r'(https?://\S+)', url_placeholder, section)  # Replace URLs with placeholder
    if len(adjusted_section) <= limit:
        return [section]  # Return the original section if it fits

    # Adjust the splitting logic to account for item markers
    item_pattern = re.compile(r'\d+\.\s')  # Pattern to detect item numbering
    sentences = adjusted_section.split('. ')
    tweets = []
    current_tweet = ""
    for sentence in sentences:
        if len(current_tweet) + len(sentence) + 1 > limit or item_pattern.match(sentence):
            if current_tweet:
                # Replace placeholders with actual URLs when adding tweet to list
                for url in urls:
                    current_tweet = current_tweet.replace(url_placeholder, url, 1)
                    urls = [u for u in urls if u != url]  # Remove used URL
                    break
                tweets.append(current_tweet)
                current_tweet = sentence
            else:
                # Directly add if alone it exceeds the limit
                tweets.append(sentence)
        else:
            if current_tweet:
                current_tweet += ". "
            current_tweet += sentence

    # Replace placeholders with actual URLs for the last tweet
    for url in urls:
        current_tweet = current_tweet.replace(url_placeholder, url, 1)
    if current_tweet:
        tweets.append(current_tweet)
    return tweets
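A quick, hypothetical way to sanity-check the splitter before wiring it into the posting callback; the report snippet and URLs below are made up for illustration.
sample = (
    "1. Bitcoin ETF inflows hit a new high. Commenters point to growing institutional demand. "
    "More at https://www.reddit.com/r/CryptoCurrency/comments/example1. "
    "2. A major layer-2 upgrade shipped this week. Users report noticeably lower fees. "
    "Details: https://www.reddit.com/r/CryptoCurrency/comments/example2."
)

for i, tweet in enumerate(split_section_into_tweets(sample), start=1):
    print(f"--- tweet {i} ({len(tweet)} chars) ---")
    print(tweet)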
Post the split Tweets
def post_to_twitter_callback(task_output):
    print("Callback function has been triggered.")

    # Initialize the Tweepy client (bearer_token_tw, api_key_tw, etc. are loaded from .env)
    client = tweepy.Client(
        bearer_token=bearer_token_tw,
        consumer_key=api_key_tw,
        consumer_secret=api_key_secret_tw,
        access_token=access_token_tw,
        access_token_secret=access_token_secret_tw,
    )

    # Split the task output into tweet-sized chunks
    tweets = split_section_into_tweets(task_output)
    total_tweets = len(tweets) + 1  # +1 for the initial header tweet
    sleep_duration = 0.4
    reply_to_tweet_id = None

    # Prepare the initial tweet
    today_date = datetime.datetime.now().strftime("%B %d, %Y")
    initial_tweet_text = f"Your Daily Crypto Updates of {today_date} (1/{total_tweets})"

    try:
        # Post the initial header tweet
        response = client.create_tweet(text=initial_tweet_text)
        reply_to_tweet_id = response.data['id']
        print(f"Posted initial Tweet ID: {response.data['id']}")
        time.sleep(sleep_duration)
    except Exception as e:
        print(f"Error posting initial tweet: {e}")

    # Post the rest of the tweets as replies; start counting from 2 since the first tweet is the header
    for index, tweet in enumerate(tweets, start=2):
        tweet_with_count = f"{tweet} ({index}/{total_tweets})"
        if len(tweet_with_count) > 270:
            # Truncate so the text, ellipsis and counter stay within the limit
            allowed_length = 270 - len(f" ({index}/{total_tweets})") - 3
            tweet_with_count = tweet[:allowed_length] + "..." + f" ({index}/{total_tweets})"
        print(tweet_with_count)
        try:
            response = client.create_tweet(text=tweet_with_count, in_reply_to_tweet_id=reply_to_tweet_id)
            reply_to_tweet_id = response.data['id']
            print(f"Posted Tweet ID: {response.data['id']}")
            print(f'sleeping for {sleep_duration} sec...')
            time.sleep(sleep_duration)
        except Exception as e:
            print(f"Error posting tweet: {e}")
Results and Future Development
It has been running for three days now, and the results can be found here:
As you can see, the formatting could be better. Nevertheless, interesting findings have been posted over the last few days. One can imagine that such Agentic workflows open up opportunities for more complex tasks. The full code for this first iteration can be found on my GitHub. My plan is to build this project out in the following way:
Set up three Crews:
An Explorer Crew which consists of multiple Agents scraping the depths of the internet looking for Crypto developments (think about Telegram and Discord channels)
A Formatter Crew which takes responsibility for formatting the scraped content into the desired format for a specific social media platform (think about a blog for Substack and a bullet-point list for Discord)
A Social Media Crew which makes sure the formatted content is posted at a predetermined frequency to several social media channels.
In addition, I currently run the main.py file once per day. Once I'm satisfied, I'll place the code in a loop and deploy it to a cloud service so it runs automatically.
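A minimal sketch of what that loop could look like once deployed; the 24-hour interval and the structure are assumptions.
import time

while True:
    result = crew.kickoff()           # run the Reddit-scraping and formatting workflow
    post_to_twitter_callback(result)  # post the day's findings to X
    time.sleep(24 * 60 * 60)          # wait a day before the next run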
Call to Action
Now, I invite you to dive into the world of Agentic Workflows on your own. These workflows are gaining more and more traction, so there's no better time than now to get your hands dirty. Challenge yourself to automate a task, big or small. Experiment, tinker, and see what you can create; it's still a playground for the curious. So, take the leap and start playing around with Agentic Workflows today. Who knows what you might discover or achieve?
Thanks,
Michiel & Agentic