Skip to content

Zaharadniuk_Aliaksei_fiz.zagorodnAA@gmail.com#2

Open
AlexSpaceBy wants to merge 7 commits into
epam-python-courses-7-bsu:masterfrom
AlexSpaceBy:AliakseiZaharadniuk
Open

Zaharadniuk_Aliaksei_fiz.zagorodnAA@gmail.com#2
AlexSpaceBy wants to merge 7 commits into
epam-python-courses-7-bsu:masterfrom
AlexSpaceBy:AliakseiZaharadniuk

Conversation

@AlexSpaceBy

Copy link
Copy Markdown

Первая версия программы RssReader. Так как создание программы (и ее логическая архитектура) делалась на протяжении предыдущей недели, то окончательный вид (расположение папок, название файлов) не совпадает с тем, что было указано в требованиях README.md. В файле README.txt и README.md имеется полное описание работы программы, указания для установки пакета программы (итерация 2, папка /dist, zip архив), и приведена структура программы. Модуль с unit-тестами будет добавлен позже (планирую его добавить после третьей итерации). В программе используются две сторонние библиотеки: feedparser.py и html2text (пакет из нескольких модулей). Данные библиотеки уже настроены для использования с программой, поэтому рекомендуется использовать именно их, а не аналогичные но скачанные из других репозиториев.

Comment thread RssReader/args_parser.py Outdated
import sys


def get_parse(args_in=''):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below:
It's a good practice to add type hits in your function signatures (e.g. def get_parse(args_in='': str) -> Dict[str, Any]:

Please, take a loot at typing lib

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All lines were found and fixed where appropriate. Will be updated during the next "push" iteration.

Comment thread RssReader/args_parser.py Outdated
logs.log_err_exit()
parser.exit()

return {'url': args.url, 'json': args.json, 'verbose': args.verbose, 'limit': args.limit}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, take a look at dataclasses lib. It might be better to replace this dict with dataclass object.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it may touch the fundamental logic of the program, I'll try to implement dataclasses when I have a working program with at least first four complete iterations.

Comment thread RssReader/args_parser.py Outdated
Comment thread RssReader/args_parser.py Outdated
Comment thread RssReader/logs.py Outdated
Comment thread RssReader/logs.py Outdated
Comment thread RssReader/rss_parser.py Outdated
"""This function receives the answer from server"""

news = feedparser.parse(url)
if news.entries:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return news if news.entries else None

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line was found and fixed. It will be updated during the next "push" iteration.

Comment thread RssReader/rss_parser.py Outdated

for i in range(3):
try:
a = urllib.request.urlopen(url).getcode()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, do not use single-charachter variables names.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have checked all one letter variables in the program and fixed it. It will be updated during the next "push" iteration.

Comment thread RssReader/rss_parser.py Outdated
In case of failure, it reconnects in 10 seconds
"""

for i in range(3):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic will retry to get a response from server if status code is other than 200. But it's is useless to retry for example 400, 500 HTTP errors, invalid urls and some. You can think about more optimal solution. urllib has it's own Retry included.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function was fixed. It will be updated during the next "push" iteration.

@HenadziStantchik

HenadziStantchik commented Nov 3, 2019

Copy link
Copy Markdown
Collaborator

Так как создание программы (и ее логическая архитектура) делалась на протяжении предыдущей недели, то окончательный вид (расположение папок, название файлов) не совпадает с тем, что было указано в требованиях README.md.

Please make sure that project structure is the same as specified in our README.md. This is important for CI checks that will be added in future. Note that we only require that application entry point shold be in final_task/rss_reader/rss_reader.py and setup.py file for setuptools should be in final_task/setup.py. Other than that you are free to make your project structure as you see fit.

В файле README.txt и README.md имеется полное описание работы программы, указания для установки пакета программы (итерация 2, папка /dist, zip архив), и приведена структура программы.

Your documentation looks great, but you were supposed to use separate file final_task/README.md for it, not file containing instructions for PR creation. Please move your documentation there.

В программе используются две сторонние библиотеки: feedparser.py и html2text (пакет из нескольких модулей). Данные библиотеки уже настроены для использования с программой, поэтому рекомендуется использовать именно их, а не аналогичные но скачанные из других репозиториев.

While we allow using 3rd party libraries in your final project, it is realy bad practice to copy its code directly, even if you changed it a little. Usually you want to use it through imports. Or if you need to change its functionality a little, you should implement your own class that inherits the one in the library and reimplement required behavior there.

Please fix these issues, especially the last one,

@HenadziStantchik HenadziStantchik added reviewed Review was performed question Further information is requested and removed question Further information is requested reviewed Review was performed labels Nov 5, 2019
@AlexSpaceBy

Copy link
Copy Markdown
Author

Новая версия программы. Включает вторую и третью итерацию. Структура программы была изменена, чтобы полностью соответствовать требованиям, внесены изменения в код программы. Исправлено большинство замечаний. Полностью переделаны некоторые функции. Добавлена опция записи новостей в локальное хранилище, извлечение новостей из локального хранилища для последующей печати в консоль. Переписан setup.py файл, удалены библиотеки feedparser, html2text. Программа при установке автоматически скачивает новые и необходимые библиотеки для корректной работы. Работоспособность была проверена на Windows 10 (1903) и на Linux Fedora 30 с python версии 3.8

Comment thread README.md Outdated
Comment thread rss_reader/args_parser.py Outdated
Comment thread rss_reader/rss_reader.py Outdated
@HenadziStantchik

Copy link
Copy Markdown
Collaborator
  1. Project structure still not exactly corresponds to the requirements. rss_reader folder is supposed to be inside final_task folder.

  2. Please describe the way to launch your program in your documentation, because I could not do it. I constantly got ModuleNotFoundError

@AlexSpaceBy

Copy link
Copy Markdown
Author

Версия 3.1 (первые три итерации). Исправлена ошибка ModuleNotFoundError добавлением пути поиска файлов в переменные среды. Исправлены все замечания с import, изменена структура программы. В файл README.MD добавлена глава запуска приложения с примерами команд. Описан запуск "сырого" не установленного пакета из каталога rss_reader, сборка пакета для установки, установка пакета, запуск пакета при его установке.

Comment thread final_task/rss_reader/rss_reader.py Outdated
@HenadziStantchik

Copy link
Copy Markdown
Collaborator

Here is feedback on your application work:

  1. If user does not specify limit then he gets only 1 piece of news. He/she should get all available feed in this case.

  2. --verbose argument is supposed to output logs in the process of script running, not after it completes its work.

  3. Date: in the news is supposed to be publication date, not the date/time when you parsed these news.
    The same is for storing the news in cache.

  4. --limit does not work with --date argument

  5. If you specify limit larger than feed size you will get IndexError exception with traceback printed in stdout.

  6. Your news cache just appends new entries to the end of the file, even if the same news already were in it. As a result if I run your app 5 times with --limit 1 and then run it with --date <current_date> i will get 5 times the same piece of news.

3rd Iteration is not that trivial as it looks, specifically because of that we are asking you to describe your method of storing cached news.

except SystemExit:
logs.log_err_init_args(args_in)
logs.log_err_exit()
parser.exit()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below: It is better to have only one exit point in the program. For example you can raise your custom exception here and handle it in the main function.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was fixed for major cycles. The rest will be fixed after fourth iteration.

Comment thread final_task/rss_reader/args_parser.py
flag = True
except FileNotFoundError:
print('Error: File not found. (Maybe this is a first time you are running a program)')
return 1

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below: while sometimes it is common practice to return your custom 'statuses' it reassly dampens code readability. It is better to raise/reraise an exception here and handle it in the main thread. If everythoong went good just return True.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be fixed after the fourth iteration.

Comment thread final_task/rss_reader/other.py Outdated
@AlexSpaceBy

Copy link
Copy Markdown
Author

Версия программы 3.5 Исправлены основные замечания по флагам limit, date, verbose, а так же их комбинации. Была дополнена третья итерация: прежде чем записать новость, программа проверяет наличие дубликата внутри журнала. Была сделана четвертая итерация: добавлено конвертирование в формат pdf согласно недавним рекомендациям в slack. Использование конвертации подробно описано в README.md

Comment thread final_task/rss_reader/rss_parser.py Outdated
Comment thread final_task/rss_reader/rss_parser.py Outdated
# Everything deals with time delay will be repeated
except urllib.error.HTTPError as http_err:
if http_err.code in (503, 504, 522, 524):
print('')

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, above and below: It is better to think of some other way to make an empty line instead of printing ''

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be fixed during final commit.

Comment thread final_task/rss_reader/news.py Outdated
@HenadziStantchik

Copy link
Copy Markdown
Collaborator

Here is some feedback on your application:

  1. --date does not work as expected: date user specifies is supposed to be publication date, not parsing date:

zaharadnik

  1. --verbose argument is supposed to print logs together with news, not instead of them.

  2. Currently it is impossible to read news from local cache without specifying RSS url.

  3. ARIALUNI.TTF file not found during converting news to pdf format. got an exception with traceback in stdout.

  4. Ensure that you have mo encoding problems in converted pdf file.

  5. --verbose in tandem with -p is not supposed to convert log journal to pdf, just to print logs as you convert news to pdf in stdout.

  6. You should convert news from local cache instead of internet source only when user specified --date option. Right now you do it every time.

@AlexSpaceBy

AlexSpaceBy commented Nov 20, 2019

Copy link
Copy Markdown
Author

Четвертая итерация, добавлено конвертирование в формат html. Улучшено конвертирование в формат pdf. Исправлена ошибка с отображением даты, улучшен вывод логов (теперь логи печатаются с новостями). Улучшен процесс конвертации новостей из локального хранилища (теперь для конвертации не нужен Интернет). Исправлена ошибка с шрифтами при конвертировании новостей. Улучшен ввод аргументов (теперь для использования программы не обязательно вводить url для позиционного аргумента). Исправлен ряд замечаний относительно кода.
p.s.
Из-за ошибки при добавлении коммита в ветку, было отправлено два одинаковых коммита с разницей в несколько минут. При рецензировании прошу учитывать только последний коммит.

@AlexSpaceBy AlexSpaceBy changed the title Zaharadniuk_Aliaksei_fiz.zagorodnAA@gmail.com Zaharadniuk_Aliaksei_fiz.zagorodnAA@gmail.com Please Review Nov 20, 2019
@HenadziStantchik

Copy link
Copy Markdown
Collaborator

Из-за ошибки при добавлении коммита в ветку, было отправлено два одинаковых коммита с разницей в несколько минут. При рецензировании прошу учитывать только последний коммит.

You can squash these commits

@HenadziStantchik

HenadziStantchik commented Nov 21, 2019

Copy link
Copy Markdown
Collaborator

Here is some feedback on your app work:

  1. Your app has encoding problems:

zaharadnik

Same for --json, and in pdf converted file

  1. --date does not work with --json

  2. HTML converted file has encoding problems. (Firefox browser, Ubuntu OS):
    zaharadnik

  3. (optional) It would be nice to make pdf converted file name same as html converted file name (e.g. with date)

  4. Logs should not be converted to pdf or html. If --verbose option is specified then logs should be printed in stdout as the conversion goes.

Comment on lines +44 to +50
return {'url': args.url,
'json': args.json,
'verbose': args.verbose,
'limit': args.limit,
'date': args.date,
'pdf': args.pdf,
'html': args.html}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question:

You already have all arguments inside args object. Do you really need to make this dict here? Why not just pass args everywhere instead?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use this data in the code in rss_reader.py module. It is more convenient for me to use it in this form. In my opinion it make my code easy to read.

Comment thread final_task/rss_reader/args_parser.py Outdated
Comment thread final_task/rss_reader/args_parser.py Outdated
Comment thread final_task/rss_reader/converter.py Outdated
Comment thread final_task/rss_reader/converter.py Outdated
Comment thread final_task/rss_reader/converter.py Outdated
Comment on lines +95 to +98
pdf = FPDF()
pdf.set_font('Arial', 'B', size=14)
pdf.set_margins(10, 10, 10)
pdf.add_page(('P', 'A4'))

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You had the same chunk of code above, it is better to move it to a separate function

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code above is not the same as above. The main difference is the font: arial_uni for the first case, Arial for the second case.

Comment thread final_task/rss_reader/logs.py Outdated
Comment on lines +88 to +89
url = line[73:]
info = line[0:66]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These numbers are confusingly specific, please leave a comment here why exactly it is 73 and 66

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function deals with logs, that has specific template: first 66 characters for technical information data, the rest for url. It is convenient for me to slice it this way.

Comment on lines +188 to +191
def log_log_html():
"""This function logs the conversion of log journal to html"""

logging.info('Log journal was converted to html')

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using logging decorator instead of this module.

Comment thread final_task/rss_reader/other.py Outdated
Comment on lines +137 to +153
def convert_date(date: str):
"""This function converts date"""
month = {'Jan': '1',
'Feb': '2',
'Mar': '3',
'Apr': '4',
'May': '5',
'Jun': '6',
'Jul': '7',
'Aug': '8',
'Sep': '9',
'Oct': '10',
'Nov': '11',
'Dec': '12'}
day = date[5:7]
month_num = month[date[8:11]]
year = date[12:16]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same: datetime lib

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the data from argsparse is a string, it is more convenient to use separate function instead of datetime.

@HenadziStantchik HenadziStantchik Dec 3, 2019

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is still redundant. It is better to use existing optimized solution that works in any situation instead of implementing new one

Comment thread final_task/rss_reader/rss_reader.py Outdated
logs.new_session()
commands = args_parser.get_parse()

while True:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to just remove user input above and get rid of this while completely. This is not how app is supposed to work

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The structure of the program was fixed, there is no more while cycle. It will be updated during the next commit.



if __name__ == '__main__':
main()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

main function is too large. It is better to split it into a smaller functions

@HenadziStantchik HenadziStantchik changed the title Zaharadniuk_Aliaksei_fiz.zagorodnAA@gmail.com Please Review Zaharadniuk_Aliaksei_fiz.zagorodnAA@gmail.com Nov 21, 2019
@AlexSpaceBy

Copy link
Copy Markdown
Author

Финальная версия программы RSS Reader (четыре итерации) - исправлены основные замечания, изменена логика программы, дополнен функционал.

data['link'] = rss.entries[limit].link
data['title'] = rss.entries[limit].title

"""news_date for third iteration, yyyy-mm-dd"""

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are usually written using #

if data['image'] is not None:
print('Image: ' + data['image'])
print('. . . . . . . . . . . . . . . . . . . . . .')
print('')

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to think of some other way for printing an empty line instead of using print("")

@HenadziStantchik

Copy link
Copy Markdown
Collaborator

Here is feedback on your application:
Iteration 1:
Everything works fine.
Iteration 2:
Installs fine. All iterations work.
Iteration 3:
Everything works fine.
Iteration 4:
Everything works fine.

The code is readable, decently structured, mostly corresponds to the PEP8 guidelines.
Tests are decent, but the coverage percentage is not high enough.
Commit messages are somewhat informative, but do not correspond to the recommended commit message style.
Overall aside from some commented issues, good job.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants