Introduction
This week I was tasked with refactoring ReadmeGenie. If you’ve just arrived, ReadmeGenie is my open-source project that uses AI to generate READMEs based on the files the user provides.
Initially, my thoughts were, "The program is working fine. I’ve been developing it in an organized way since day one... so why change it?"
Well, after taking a week-long break from the project, I opened it up again and immediately thought, "What is this?"
Why refactor?
To give you some context, here’s an example: One of my core functions, which I once thought was perfect, turned out to be much more complex than necessary. During the refactoring process, I broke it down into five separate functions—and guess what? The code is much cleaner and easier to manage now.
Take a look at the original version of this function:
def generate_readme(file_paths, api_key, base_url, output_filename, token_usage):
    try:
        load_dotenv()

        # Check if the api_key was provided either as an environment variable or as an argument
        if not api_key and not get_env():
            logger.error(f"{Fore.RED}API key is required but not provided. Exiting.{Style.RESET_ALL}")
            sys.exit(1)

        # Concatenate content from multiple files
        file_content = ""
        try:
            for file_path in file_paths:
                with open(file_path, 'r') as file:
                    file_content += file.read() + "\\n\\n"
        except FileNotFoundError as fnf_error:
            logger.error(f"{Fore.RED}File not found: {file_path}{Style.RESET_ALL}")
            sys.exit(1)

        # Get the base_url from arguments, environment, or use the default
        chosenModel = selectModel(base_url)
        try:
            if chosenModel == 'cohere':
                base_url = os.getenv("COHERE_BASE_URL", "https://api.cohere.ai/v1")
                response = cohereAPI(api_key, file_content)
                readme_content = response.generations[0].text.strip() + FOOTER_STRING
            else:
                base_url = os.getenv("GROQ_BASE_URL", "https://api.groq.com")
                response = groqAPI(api_key, base_url, file_content)
                readme_content = response.choices[0].message.content.strip() + FOOTER_STRING
        except AuthenticationError as auth_error:
            logger.error(f"{Fore.RED}Authentication failed: Invalid API key. Please check your API key and try again.{Style.RESET_ALL}")
            sys.exit(1)
        except Exception as api_error:
            logger.error(f"{Fore.RED}API request failed: {api_error}{Style.RESET_ALL}")
            sys.exit(1)

        # Process and save the generated README content
        if readme_content[0] != '*':
            readme_content = "\n".join(readme_content.split('\n')[1:])

        try:
            with open(output_filename, 'w') as output_file:
                output_file.write(readme_content)
            logger.info(f"README.md file generated and saved as {output_filename}")
            logger.warning(f"This is your file's content:\n{readme_content}")
        except IOError as io_error:
            logger.error(f"{Fore.RED}Failed to write to output file: {output_filename}. Error: {io_error}{Style.RESET_ALL}")
            sys.exit(1)

        # Save API key if needed
        if not get_env() and api_key is not None:
            logger.warning("Would you like to save your API key and base URL in a .env file for future use? [y/n]")
            answer = input()
            if answer.lower() == 'y':
                create_env(api_key, base_url, chosenModel)
        elif get_env():
            if chosenModel == 'cohere' and api_key != os.getenv("COHERE_API_KEY"):
                if api_key is not None:
                    logger.warning("Would you like to save this API Key? [y/n]")
                    answer = input()
                    if answer.lower() == 'y':
                        create_env(api_key, base_url, chosenModel)
            elif chosenModel == 'groq' and api_key != os.getenv("GROQ_API_KEY"):
                if api_key is not None:
                    logger.warning("Would you like to save this API Key? [y/n]")
                    answer = input()
                    if answer.lower() == 'y':
                        create_env(api_key, base_url, chosenModel)

        # Report token usage if the flag is set
        if token_usage:
            try:
                usage = response.usage
                logger.info(f"Token Usage Information: Prompt tokens: {usage.prompt_tokens}, Completion tokens: {usage.completion_tokens}, Total tokens: {usage.total_tokens}")
            except AttributeError:
                logger.warning(f"{Fore.YELLOW}Token usage information is not available for this response.{Style.RESET_ALL}")

        logger.info(f"{Fore.GREEN}File created successfully")
        sys.exit(0)
    # (the except clause that closes the outer try is not part of this excerpt)
1. Eliminate Global Variables
Global variables can lead to unexpected side effects. Keep the state within the scope it belongs to, and pass values explicitly when necessary.
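As a minimal sketch of this idea, compare a function that reads module-level state with one that receives the same value as a parameter. The names `add_footer_global`, `add_footer`, and `footer` are hypothetical and are not taken from ReadmeGenie:

```python
# Hypothetical illustration; these names are not from ReadmeGenie.

# Before: the function silently depends on a module-level variable.
footer = "\n\nGenerated by a tool"


def add_footer_global(text):
    # The result changes whenever some other code reassigns `footer`.
    return text + footer


# After: the dependency is an explicit parameter.
def add_footer(text, footer_text):
    return text + footer_text


print(add_footer("# My project", "\n\nGenerated by a tool"))
```

The second version can be tested with any footer without touching shared state.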
2. Use Functions for Calculations
Avoid storing intermediate values in variables where possible. Instead, use functions to perform calculations when needed—this keeps your code flexible and easier to debug.
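One way to picture this, using the token counts that the original function logs as a loose inspiration, is to compute the total on demand rather than keeping it in a separate variable. The function names below are assumptions made for illustration only:

```python
# Hypothetical example: derive a value when it is needed instead of caching it.

def total_tokens(prompt_tokens, completion_tokens):
    """Compute the total from its two parts."""
    return prompt_tokens + completion_tokens


def usage_summary(prompt_tokens, completion_tokens):
    # No intermediate `total` variable that could drift out of date.
    return (
        f"Prompt tokens: {prompt_tokens}, "
        f"Completion tokens: {completion_tokens}, "
        f"Total tokens: {total_tokens(prompt_tokens, completion_tokens)}"
    )


print(usage_summary(120, 80))
```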
3. Separate Responsibilities
A single function should do one thing, and do it well. Split tasks like command-line argument parsing, file reading, AI model management, and output generation into separate functions or classes. This separation allows for easier testing and modification in the future.
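To illustrate, here is a rough sketch of how three of the steps inside the original `generate_readme` (reading files, cleaning the response, writing the output) might be pulled into their own functions. The function names are hypothetical; this is not the actual refactored ReadmeGenie code, and the file separator here is assumed to be a real blank line:

```python
import tempfile
from pathlib import Path


def read_files(file_paths):
    """Concatenate the contents of several files, each followed by a blank line."""
    file_content = ""
    for file_path in file_paths:
        with open(file_path, "r") as file:
            file_content += file.read() + "\n\n"
    return file_content


def clean_readme(readme_content):
    """Drop the first line unless the text already starts with '*'."""
    if readme_content and readme_content[0] != "*":
        return "\n".join(readme_content.split("\n")[1:])
    return readme_content


def write_output(output_filename, readme_content):
    """Write the generated README to disk."""
    with open(output_filename, "w") as output_file:
        output_file.write(readme_content)


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "README.md"
    write_output(target, clean_readme("Here is your README:\n* Title"))
    print(target.read_text())
```

Each piece can now be tested on its own, without an API key or a network call.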
4. Improve Naming
Meaningful variable and function names are crucial. When revisiting your code after some time, clear names help you understand the flow without needing to re-learn everything.
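A small, made-up example (not from ReadmeGenie) shows the difference. Both functions below behave identically, but only the second one tells you what it does:

```python
# Before: short names that say nothing about intent.
def proc(d, f):
    return [x for x in d if x.endswith(f)]


# After: the same logic with descriptive names.
def filter_paths_by_extension(file_paths, extension):
    return [path for path in file_paths if path.endswith(extension)]


print(filter_paths_by_extension(["main.py", "notes.txt"], ".py"))
```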
5. Reduce Duplication
If you find yourself copying and pasting code, it’s a sign that you could benefit from shared functions or classes. Duplication makes maintenance harder, and small changes can easily result in bugs.
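The original `generate_readme` has a case of this: the "Would you like to save this API Key?" branch appears twice, once for Cohere and once for Groq. A possible way to collapse the two checks into one helper is sketched below. The helper name `is_new_api_key` and the lookup table are assumptions; only the environment variable names come from the original function:

```python
import os

# Environment variable names as used in the original function.
API_KEY_ENV_VARS = {"cohere": "COHERE_API_KEY", "groq": "GROQ_API_KEY"}


def is_new_api_key(api_key, chosen_model, environ=os.environ):
    """Return True when a key was supplied and differs from the saved one."""
    if api_key is None:
        return False
    env_var = API_KEY_ENV_VARS.get(chosen_model)
    if env_var is None:
        return False
    return api_key != environ.get(env_var)


# The `environ` parameter lets us try the helper with a plain dict.
print(is_new_api_key("abc", "groq", {"GROQ_API_KEY": "abc"}))
print(is_new_api_key("xyz", "cohere", {"COHERE_API_KEY": "abc"}))
```

With a helper like this, the prompt and the call to `create_env` would only need to be written once.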
Committing and pushing to GitHub
1. Create a branch
I started by creating a branch using:
git checkout -b <branch-name>
This command creates a new branch and switches to it.
2. Making a Series of Commits
Once on the new branch, I made incremental commits. Each commit represents a logical chunk of work, whether it was refactoring a function, fixing a bug, or adding a new feature. Making frequent, small commits helps track changes more effectively and makes it easier to review the history of the project.
git status
git add <file_name>
git commit -m "Refactored function"
3. Rebasing to Keep a Clean History
After making several commits, I rebased my branch to keep the history clean and linear. Rebasing allows me to reorder, combine, or modify commits before they are pushed to GitHub. This is especially useful if some of the commits are very small or if I want to avoid cluttering the commit history with too many incremental changes.
git rebase -i main
In this step, I initiated an interactive rebase on top of the main branch. The -i flag allows me to modify the commit history interactively. I could squash some of my smaller commits into one larger, cohesive commit. For instance, if I had a series of commits like:
Refactor part 1
Refactor part 2
Fix bug in refactor
I could squash them into a single commit with a clearer message.
4. Pushing Changes to GitHub
Once I was satisfied with the commit history after the rebase, I pushed the changes to GitHub. If you’ve just created a new branch, you’ll need to push it to the remote repository with the -u flag, which sets the upstream branch for future pushes.
git push -u origin <branch-name>
5. Merging
In the last step, I did a fast-forward merge into the main branch and pushed again:
git checkout main                   # switch to the main branch
git merge --ff-only <branch-name>   # make a fast-forward merge
git push origin main                # push main to the remote
Takeaways
Everything has room to improve. Refactoring may seem like a hassle, but it often results in cleaner, more maintainable, and more efficient code. So, the next time you feel hesitant about refactoring, remember: there’s always a better way to do things.
Even though I think it's perfect now, I will definitely have something to improve on my next commit.